Ancestral graphs can encode conditional independence relations
that arise in directed acyclic graph (DAG) models with latent and selection
variables. However, for any ancestral graph, there may be several
other graphs to which it is Markov equivalent. We state and prove
conditions under which two maximal ancestral graphs are Markov
equivalent to each other, thereby extending analogous results for
DAGs given by other authors. These conditions lead to an algorithm
for determining Markov equivalence that runs in time that is polynomial
in the number of vertices in the graph.

1. Introduction. A graphical Markov model is a set of distributions with 
independence structure described by a graph consisting of vertices and edges. 
The independence model associated with a graph is the set of conditional 
independence relations encoded by the graph through a global Markov property.
In general, different graphs may encode the same independence model.
In this paper, we consider a particular class of graphs, called ancestral
graphs, and characterize when two graphs encode the same sets of conditional
independence relations.

The class of ancestral graphs is motivated in the following way. We suppose
our observed data were generated by a process represented by a directed
acyclic graph (DAG) with a fixed set of variables. The causal interpretation
of such a DAG is described by [18] and [14]. However, in general, we may
only have observed a subset of these variables in a specific sub-population.
Hence, some variables in the underlying DAG are not observed ("latent"),
while other variables, specifying the specific sub-population from which our
data were sampled, are conditioned upon ("selection variables").

Fig. 1. (i) A seemingly unrelated regression model and (ii) a Markov equivalent DAG
model.

Even though the underlying model is a DAG, the conditional indepen- 
dence structure holding among the observed variables, conditional on the 
selection variables, cannot always be represented by a DAG containing only 
the observed variables. For this purpose, the more general class of ancestral 
graphs is required [see Figure 2(ii) and Definition 2.1]. The statistical
models associated with ancestral graphs retain many of the desirable properties
that are associated with DAG models. 

As with DAGs, two different ancestral graphs can represent the same set of
conditional independence relations, and hence the same set of distributions.
Such graphs
are said to be Markov equivalent. A graphical characterization of the cir- 
cumstances under which graphs are Markov equivalent is of importance for 
several reasons: 

• Markov equivalent graphs lead to identical likelihoods because the sets 
of distributions obeying the Markov property associated with the graphs 
are the same. Thus, for the purposes of interpreting a model, it is often 
important to characterize those features that are common to all the graphs 
in a given class (see [18] and [13]). 

• When viewed as a Gaussian path diagram (see [15], Section 8.1), different
(maximal) ancestral graphs correspond to different parametrizations of
the same Gaussian Markov model. However, some parametrizations may
be simpler to fit than others. For example, the model corresponding to
the graph in Figure 1(i), in the Gaussian case, is an example of a seemingly
unrelated regression (SUR) model (see [23]). In general, there are
no closed form expressions for the MLEs for SUR models; iterative fitting
methods are required, and there may be multiple solutions to the likelihood
equations (see [8]). However, the graph in Figure 1(i) is Markov equivalent
to Figure 1(ii), which is a DAG. Gaussian DAG models have closed form
MLEs, and the likelihood is unimodal (see [12]). Consequently, none of
the problems that may arise for general Gaussian SUR models apply to
the specific model corresponding to Figure 1(i) (see also [7]).
Fig. 2. (i) A DAG with a latent variable H. (ii) The ancestral graph resulting from
marginalizing over H includes a bi-directed edge between Pcp and CD4.



In this paper, we provide necessary and sufficient graphical conditions un- 
der which two ancestral graphs are Markov equivalent. Though other char- 
acterizations have been given previously in [24] and [19], the criterion given 
here is the first which leads to an algorithm that runs in time polynomial 
in the size of the graph. Reference [22] solved the Markov equivalence prob- 
lem for DAGs. References [2, 3] and [9] solved the problem of representing 
Markov equivalence classes for DAGs, which we leave for future work. 

Section 2 defines the class of ancestral graphs and outlines the motivation 
for the class. Section 3 contains the main result of the paper. Discussion and 
relation to prior work are in Section 4. The Appendix contains algorithmic 
details. 



2. Ancestral graphs. The basic motivation for ancestral graphs is to en- 
able one to model the independence structure over the observed variables 
that results from a DAG containing latent or selection variables without 
explicitly including such variables in the model. To illustrate this, consider 
the DAG shown in Figure 2(i) in which Azt, Pcp, Ap and CD4 are observed
variables, while H is unobserved. Azt and Ap represent treatments given to
AIDS patients (see Robins [17], Section 2). Pcp is an opportunistic infection
that often afflicts AIDS patients, and CD4 can be viewed as a measure of
disease progression. Supposing development of Pcp was a side-effect of taking
Azt, then the DAG given in Figure 2(i) incorporates the assumption that
Azt and Ap are both randomized, Pcp and CD4 are responses correlated
by underlying health status H, and, further, that Azt does not affect CD4.
The DAG implies the following conditional independence relations over the
observed variables:

$$\mathrm{Azt} \perp\!\!\!\perp \{\mathrm{Ap}, \mathrm{CD4}\}, \qquad \mathrm{Ap} \perp\!\!\!\perp \{\mathrm{Azt}, \mathrm{Pcp}\}.$$

These relations can be derived from the DAG in Figure 2(i) via d-separation
(see [12] or [22]). Also, note that other valid independence statements, such
as $\mathrm{Azt} \perp\!\!\!\perp \mathrm{CD4}$, can be derived from the two statements given above. The
corresponding ancestral graph that represents these same conditional independence
relations is shown in Figure 2(ii). (See Section 2.2 for the definition
of an ancestral graph and Section 2.3 for the Markov property.) However,
there is no DAG on the four observed variables which represents all and only
these conditional independence relations.

As this example suggests, bi-directed edges ($\leftrightarrow$) may arise from unobserved
parents. Likewise, undirected edges ($-$) may arise from children
that have been conditioned on in the selected sub-population from which
the sample is taken (see [4] and [5]). However, bi-directed and undirected 
edges may also arise in other contexts, where both marginalization and con- 
ditioning are present. Reference [16] provides a detailed discussion on the 
interpretation of edges in an ancestral graph. 

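2.1. Basic graphical notation and terminology. We use the following terminology
to describe relations between vertices in a mixed graph $\mathcal{G}$, which
may contain three types of edge.

If $a - b$, then a is a neighbor of b, written $a \in \mathrm{ne}_{\mathcal{G}}(b)$.
If $a \leftrightarrow b$, then a is a spouse of b, written $a \in \mathrm{sp}_{\mathcal{G}}(b)$.
If $a \to b$, then a is a parent of b, written $a \in \mathrm{pa}_{\mathcal{G}}(b)$.
If $a \leftarrow b$, then a is a child of b, written $a \in \mathrm{ch}_{\mathcal{G}}(b)$.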

(For a formal set-theoretic definition of mixed graphs see [15], Appendix.) 
Two vertices that are connected by some edge are said to be adjacent. Note 
that the three edge types should be considered as distinct symbols, and that 
all the mixed graphs we consider in this paper are simple in that they have 
at most one edge between each pair of vertices. If there is an edge $a \to b$ or
$a \leftrightarrow b$, then there is said to be an arrowhead at b on this edge. Conversely,
if there is an edge $a \to b$ or $a - b$, then there is said to be a tail at a. We
also do not allow a vertex to be adjacent to itself. We restrict attention to
graphs with finite vertex sets.
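
The edge-mark vocabulary above can be made concrete with a small sketch. The
representation used here (a dictionary of edge marks, the helper names
`build_marks` and `relation`, and the toy graph) is an illustrative assumption
and not notation taken from the paper.

```python
def build_marks(edges):
    """Map (u, v) to the mark at v on the edge between u and v: '>' or '-'."""
    ends = {"--": ("-", "-"), "->": ("-", ">"), "<->": (">", ">")}
    marks = {}
    for a, kind, b in edges:
        at_a, at_b = ends[kind]
        marks[(a, b)] = at_b  # mark at b on the (a, b) edge
        marks[(b, a)] = at_a  # mark at a on the (a, b) edge
    return marks


def relation(marks, a, b):
    """Return how a relates to b, or None if a and b are not adjacent."""
    if (a, b) not in marks:
        return None
    at_a, at_b = marks[(b, a)], marks[(a, b)]
    names = {
        ("-", "-"): "neighbor",  # a - b
        (">", ">"): "spouse",    # a <-> b
        ("-", ">"): "parent",    # a -> b
        (">", "-"): "child",     # a <- b
    }
    return names[(at_a, at_b)]


# Hypothetical toy graph: u - v, v -> w, w <-> z.
toy = build_marks([("u", "--", "v"), ("v", "->", "w"), ("w", "<->", "z")])
for pair in [("u", "v"), ("v", "w"), ("w", "v"), ("w", "z"), ("u", "w")]:
    print(pair, relation(toy, *pair))
```

Storing one mark per ordered pair makes "arrowhead at b" and "tail at a" direct
dictionary lookups, and a simple graph is enforced because each pair has a
single entry.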

A path $\pi$ between two vertices x and y in a simple mixed graph $\mathcal{G}$ is a
sequence of distinct vertices $\pi = (x, v_1, \ldots, v_k, y)$ such that each vertex in
the sequence is adjacent to its predecessor and its successor; x and y are the
endpoints of $\pi$; all other vertices on the path are nonendpoints of $\pi$. If a and
b are distinct vertices on $\pi$, then the portion of $\pi$ between a and b is called
a section of $\pi$, denoted $\pi(a, b)$. Note that we use both $\pi(a, b)$ and $\pi(b, a)$
to represent the same section of $\pi$. A path of the form $x \to \cdots \to y$, on
which every edge is of the form $\to$, with the arrowheads pointing toward
y, is a directed path from x to y. A directed path from x to y, together with
an edge $y \to x$ in $\mathcal{G}$, is called a directed cycle.



2.2. Definition of ancestral graphs. DAGs are directed graphs in which 
directed cycles are not permitted. Similarly, certain configurations of edges 
are not permitted in ancestral graphs: 



MARKOV EQUIVALENCE FOR ANCESTRAL GRAPHS 




Definition 2.1. A graph, which may contain undirected ($-$), directed
($\to$) or bi-directed edges ($\leftrightarrow$), is ancestral if:

(a) there are no directed cycles;

(b) whenever there is an edge $x \leftrightarrow y$, then there is no directed path from
x to y, or from y to x;

(c) if there is an undirected edge $x - y$, then x and y have no spouses or
parents.

Conditions (a) and (b) may be summarized by saying that, if x and y are 
joined by an edge and there is an arrowhead at x, then x is not an ancestor 
of y; this is the motivation for the term "ancestral." 
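
The summary of conditions (a) and (b) suggests a direct check of Definition 2.1.
The following is a minimal sketch under assumed conventions: edges are given
as `(a, kind, b)` triples with `kind` one of `"--"`, `"->"`, `"<->"`, and the
function names are hypothetical. It favors clarity over efficiency.

```python
def build_marks(edges):
    """Map (u, v) to the mark at v on the edge between u and v: '>' or '-'."""
    ends = {"--": ("-", "-"), "->": ("-", ">"), "<->": (">", ">")}
    marks = {}
    for a, kind, b in edges:
        marks[(b, a)], marks[(a, b)] = ends[kind]
    return marks


def ancestors(marks, v):
    """Vertices with a directed path to v, together with v itself."""
    seen, stack = {v}, [v]
    while stack:
        child = stack.pop()
        for (p, c), at_c in marks.items():
            is_parent = c == child and at_c == ">" and marks[(c, p)] == "-"
            if is_parent and p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def is_ancestral(edges):
    marks = build_marks(edges)
    for (x, y), at_y in marks.items():
        at_x = marks[(y, x)]
        # Conditions (a) and (b): an arrowhead at x on an edge to y
        # means that x must not be an ancestor of y.
        if at_x == ">" and x in ancestors(marks, y):
            return False
        # Condition (c): an endpoint of an undirected edge has no
        # arrowhead on any edge, i.e. no parents and no spouses.
        if at_x == "-" and at_y == "-":
            if any(m == ">" for (_, t), m in marks.items() if t == x):
                return False
    return True


print(is_ancestral([("a", "->", "b"), ("b", "<->", "c"), ("d", "->", "c")]))
print(is_ancestral([("a", "->", "b"), ("b", "->", "c"), ("c", "->", "a")]))
```

The first graph satisfies all three conditions; the second contains a directed
cycle and is rejected.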

A vertex a is said to be an ancestor of a vertex b if either there is a
directed path $a \to \cdots \to b$ from a to b or a = b. Further, if a is an ancestor
of b, then b is said to be a descendant of a.

A vertex a is said to be anterior to a vertex b if a = b or there is a path
$\mu$ between a and b, on which every edge is either of the form $c - d$ or
$c \to d$, with d between c and b on $\mu$; such a path $\mu$ is said to be an anterior
path from a to b. By (c) in Definition 2.1, the configuration $\to c\, -$ never
occurs in an ancestral graph; hence, every anterior path takes the form

$$a - \cdots - c \to \cdots \to b,$$

where a = c and c = b are possible. We use an(x), de(x) and ant(x) to
denote, respectively, the ancestors of x, the descendants of x and the vertices
anterior to x. We apply these definitions disjunctively to sets. For example,

$$\mathrm{an}(X) = \{a \mid a \text{ is an ancestor of } b \text{ for some } b \in X\},$$

$$\mathrm{ant}(X) = \{a \mid a \text{ is anterior to } b \text{ for some } b \in X\}.$$

By definition, $X \subseteq \mathrm{an}(X) \subseteq \mathrm{ant}(X)$. Note that every DAG is an ancestral
graph, since clauses (b) and (c) are trivially satisfied.
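
As a hedged illustration of the set-valued definitions, an(X) and ant(X) can
both be computed by a backward search that follows edges with a tail at the
far end. The function names and the toy graph below are assumptions made for
this sketch only.

```python
def build_marks(edges):
    """Map (u, v) to the mark at v on the edge between u and v: '>' or '-'."""
    ends = {"--": ("-", "-"), "->": ("-", ">"), "<->": (">", ">")}
    marks = {}
    for a, kind, b in edges:
        marks[(b, a)], marks[(a, b)] = ends[kind]
    return marks


def backward_closure(marks, targets, allow_undirected):
    """Vertices reaching `targets` along -> edges (and - edges if allowed)."""
    seen, stack = set(targets), list(targets)
    while stack:
        d = stack.pop()
        for (c, t), at_t in marks.items():
            # Only edges into d whose mark at c is a tail can be used.
            if t != d or c in seen or marks[(t, c)] != "-":
                continue
            if at_t == ">" or allow_undirected:  # c -> d, or c - d
                seen.add(c)
                stack.append(c)
    return seen


def an(marks, X):
    return backward_closure(marks, X, allow_undirected=False)


def ant(marks, X):
    return backward_closure(marks, X, allow_undirected=True)


# Hypothetical toy graph: u - v, v -> w, z <-> w.
toy = build_marks([("u", "--", "v"), ("v", "->", "w"), ("z", "<->", "w")])
X = {"w"}
print(sorted(an(toy, X)), sorted(ant(toy, X)))
```

In the toy graph the spouse z is neither an ancestor of nor anterior to w,
while u is anterior to w without being an ancestor, so both inclusions in
X ⊆ an(X) ⊆ ant(X) are strict here.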

In the next lemma and elsewhere, we will make use of the shorthand
notation $x \mathrel{?\!\!\to} y$ to indicate that either $x \to y$ or $x \leftrightarrow y$. Similarly, $x \mathrel{?\!-} y$
indicates that either $x \leftarrow y$ or $x - y$, while $x \mathrel{?\!-\!?} y$ indicates any edge.

Lemma 2.2. Let a, b, c be vertices in an ancestral graph $\mathcal{G}$ with a and c
adjacent. If $a \mathrel{?\!\!\to} b \to c$, then $a \mathrel{?\!\!\to} c$. In particular, if the edge ends at a on
the (a,b) and (a,c) edges differ, then we have $c \leftarrow a \leftrightarrow b$; otherwise, either
$c \leftarrow a \to b$, or $c \leftrightarrow a \leftrightarrow b$.

We make use of this property in Sections 3.7 and 3.9.

Proof of Lemma 2.2. Suppose, for a contradiction, that there is a
tail at c on the (a,c) edge. Since, by hypothesis, there is an arrowhead at
c on the (b, c) edge, $a - c$ is ruled out by Definition 2.1(c), so $a \leftarrow c$. But
then $\mathcal{G}$ violates Definition 2.1(a) or (b), since $a \mathrel{?\!\!\to} b \to c \to a$. Hence, $a \mathrel{?\!\!\to} c$. The
conclusion then follows from noting that the configuration $a \to b \to c \leftrightarrow a$
is not ancestral. □

2.3. The m-separation criterion. In an ancestral graph, a nonendpoint
vertex v on a path is said to be a collider if two arrowheads meet at v
(i.e., $\to v \leftarrow$, $\leftrightarrow v \leftrightarrow$, $\leftrightarrow v \leftarrow$ or $\to v \leftrightarrow$). All other nonendpoint
vertices on a path are noncolliders (i.e., $- v -$, $- v \to$, $\to v \to$,
$\leftarrow v \to$ or $\leftrightarrow v \to$). These definitions of collider and noncollider are
direct extensions of the corresponding definitions for DAGs. A path along 
which every nonendpoint is a collider is called a collider path. A path com- 
prised of 3 vertices is called a triple. In an ancestral graph, a triple is either 
a collider or a noncollider; we refer to this as the type of the triple. Hence, 
if (a, b, c) forms a triple, then (c, b, a) and (a, b, c) are of the same type. 

Reference [22] introduced d-separation, a set of graphical conditions by 
which conditional independence relations could be read from a DAG. Refer- 
ence [15] applied a natural extension of Pearl's d-separation criterion, called 
m-separation, to ancestral graphs. 

Definition 2.3. Let a and b be distinct vertices in an ancestral graph
$\mathcal{G}$, and let Z be a subset of vertices with $a, b \notin Z$. A path $\pi$ between a and
b is said to be m-connecting given Z if the following hold:

(i) no noncollider on $\pi$ is in Z; and,

(ii) every collider on $\pi$ is an ancestor of a vertex in Z.

Two vertices a and b are said to be m-separated given Z in $\mathcal{G}$ if there
is no path m-connecting a and b given Z in $\mathcal{G}$. Likewise, sets A and B are
m-separated given Z in $\mathcal{G}$ if, for every pair $a \in A$ and $b \in B$, a and b are
m-separated given Z.
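
Definition 2.3 can be checked literally on small graphs by enumerating every
path between the two vertices. The sketch below does exactly that; it is
exponential in the worst case and is not the polynomial-time procedure the
paper develops. The function names are hypothetical, and the example graph is
an assumed encoding of the four observed variables of Figure 2(ii) (treatments
Azt and Ap, the opportunistic infection, labeled Pcp here, and CD4).

```python
def build_marks(edges):
    """Map (u, v) to the mark at v on the edge between u and v: '>' or '-'."""
    ends = {"--": ("-", "-"), "->": ("-", ">"), "<->": (">", ">")}
    marks = {}
    for a, kind, b in edges:
        marks[(b, a)], marks[(a, b)] = ends[kind]
    return marks


def ancestors_of(marks, Z):
    """an(Z): every vertex with a directed path into Z, plus Z itself."""
    seen, stack = set(Z), list(Z)
    while stack:
        child = stack.pop()
        for (p, c), at_c in marks.items():
            is_parent = c == child and at_c == ">" and marks[(c, p)] == "-"
            if is_parent and p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def simple_paths(marks, start, goal):
    """Yield every sequence of distinct adjacent vertices from start to goal."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if path[-1] == goal:
            yield path
            continue
        for (u, v) in marks:
            if u == path[-1] and v not in path:
                stack.append(path + [v])


def is_m_connecting(marks, path, Z, an_Z):
    for u, v, w in zip(path, path[1:], path[2:]):
        collider = marks[(u, v)] == ">" and marks[(w, v)] == ">"
        if collider and v not in an_Z:   # condition (ii) fails
            return False
        if not collider and v in Z:      # condition (i) fails
            return False
    return True


def m_separated(edges, a, b, Z):
    marks, Z = build_marks(edges), set(Z)
    an_Z = ancestors_of(marks, Z)
    return not any(is_m_connecting(marks, p, Z, an_Z)
                   for p in simple_paths(marks, a, b))


# Assumed encoding of Figure 2(ii): Azt -> Pcp <-> CD4 <- Ap.
fig2 = [("Azt", "->", "Pcp"), ("Pcp", "<->", "CD4"), ("Ap", "->", "CD4")]
print(m_separated(fig2, "Azt", "Ap", {"CD4"}))
print(m_separated(fig2, "Azt", "Ap", {"Pcp", "CD4"}))
```

Under this encoding the only path between Azt and Ap passes through two
colliders, so it is blocked unless both colliders are ancestors of the
conditioning set.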

For example, in the ancestral graph in Figure 2(ii), Azt and Ap are
m-separated given CD4. Definition 2.3 is an extension of the original definition
of d-separation for DAGs in that the notions of "collider" and "noncollider" 
now allow for bi-directed and undirected edges; the definition of ancestor 
is unchanged. Furthermore, d-separation is equivalent to m-separation for 
DAGs. The following result is useful. 

Lemma 2.4. In an ancestral graph $\mathcal{G}$, if $\pi$ is a path m-connecting a and
b given Z, c is on $\pi$ ($a \neq c \neq b$) and there is an arrowhead at c on the section
$\pi(a,c)$, then either $c \in \mathrm{an}(Z)$ or $\pi(c,b)$ is a directed path from c to b.

Proof. Suppose the result is false. Let c be the vertex closest to b
satisfying the premise of the lemma but not the conclusion. If c is a collider
on $\pi$, then, by definition of m-connection, $c \in \mathrm{an}(Z)$, which is a contradiction.
Let d be the vertex after c on $\pi(c, b)$. If c is a noncollider on $\pi$ then, by
Definition 2.1(c), $c \to d$. If $d \in \mathrm{an}(Z)$ or $\pi(d,b)$ forms a directed path from
d to b, then, clearly, c satisfies the conclusion of the lemma. But, if
$d \notin \mathrm{an}(Z)$ and $\pi(d, b)$ is not a directed path to b, then d satisfies the premise
of the lemma (and hence, c is not the closest such vertex to b), again a
contradiction. □

2.4. Formal independence models. An independence model over a finite
set V is a set $\mathfrak{I}$ of ternary relations $\langle X, Y \mid Z \rangle$ where X, Y and Z are disjoint
subsets of V, while X and Y are not empty; the first two arguments are
treated symmetrically, so that $\langle X, Y \mid Z \rangle \in \mathfrak{I}$ iff $\langle Y, X \mid Z \rangle \in \mathfrak{I}$. The interpretation
of $\langle X, Y \mid Z \rangle \in \mathfrak{I}$ is that X and Y are independent given Z [see [20],
Chapter 2]. The independence model associated with an ancestral graph,
$\mathfrak{I}_m(\mathcal{G})$, is defined via m-separation as follows:

$$\mathfrak{I}_m(\mathcal{G}) = \{\langle X, Y \mid Z \rangle \mid X \text{ is m-separated from } Y \text{ given } Z \text{ in } \mathcal{G}\}.$$

The independence relations in $\mathfrak{I}_m(\mathcal{G})$ comprise the global Markov property
for $\mathcal{G}$.

2.5. Probability distributions obeying a formal independence model. We
associate a set of probability distributions with a formal independence model
$\mathfrak{I}$ by using the finite set V to index a collection of random variables $(X_v)_{v \in V}$
taking values in probability spaces $(\Omega_v)_{v \in V}$. In all the examples we consider,
the probability spaces are either real finite-dimensional vector spaces or
finite discrete sets. For $A \subseteq V$, we let $\Omega_A = \times_{v \in A}(\Omega_v)$, $\Omega = \Omega_V$ and
$X_A = (X_v)_{v \in A}$. We will assume the existence of regular conditional probability
measures throughout.

A distribution P on $\Omega$ is said to obey the independence model $\mathfrak{I}$ over V
if, for all disjoint sets A, B, Z (A and B are not empty),

$$\langle A, B \mid Z \rangle \in \mathfrak{I} \implies A \perp\!\!\!\perp B \mid Z \ [P],$$

where we have used the ($\perp\!\!\!\perp$) notation of [6], and the usual shorthand that A
denotes both a vertex set and the random variable $X_A$. Thus, a distribution
P obeys $\mathfrak{I}_m(\mathcal{G})$ if, for all disjoint subsets of V, say X, Y, Z (X and Y not
empty),

$$X \text{ is m-separated from } Y \text{ given } Z \text{ in } \mathcal{G} \implies X \perp\!\!\!\perp Y \mid Z \ [P].$$

Note that, if P obeys $\mathfrak{I}$, there still may be independence relations that are
not in $\mathfrak{I}$ that also hold in P.

Fig. 3. (a) The path (a,c,d,b) is an example of an inducing path in an ancestral graph.
(b) A maximal ancestral graph Markov equivalent to (a).

2.6. Marginalizing and conditioning. In Section 4.1 of [15] operations
of marginalizing and conditioning are introduced for formal independence
models. If P obeys $\mathfrak{I}$ and $\mathfrak{I}^*$ is the independence model obtained by formally
marginalizing over variables in L and conditioning on variables in S,
then $P(X_{V \setminus (L \cup S)} \mid X_S)$ obeys the independence model $\mathfrak{I}^*$ [$P(X_S)$ a.e.] (see
Theorem 7.1 of [15], Appendices A and B of [10]).

In Section 4.2 of [15], a graphical transformation corresponding to marginalizing
and conditioning is given such that the independence model associated
with the transformed graph is the independence model obtained by
marginalizing and conditioning the independence model $\mathfrak{I}_m(\mathcal{G})$ of the original
graph (see Theorem 4.18 in [15]). Thus, in particular, if $\mathcal{G}$ is a DAG
with observed variables O, latent variables L and selection variables S, then
the ancestral graph formed by the graphical transformation applied to $\mathcal{G}$
represents those conditional independence relations implied to hold among
the observed variables O, conditional on the selection variables [$P(X_S)$ a.e.].



3. Markov equivalence. We introduce the following. 



Definition 3.1. Two ancestral graphs $\mathcal{G}_1$ and $\mathcal{G}_2$ with the same vertex
set are said to be Markov equivalent, denoted $\mathcal{G}_1 \sim \mathcal{G}_2$, if for all disjoint sets
A, B, Z (A, B not empty), A and B are m-separated given Z in $\mathcal{G}_1$ if and
only if A and B are m-separated given Z in $\mathcal{G}_2$; that is, $\mathfrak{I}_m(\mathcal{G}_1) = \mathfrak{I}_m(\mathcal{G}_2)$.

The graphs in Figure 3 are Markov equivalent, as are $\mathcal{G}_1$ and $\mathcal{G}_2$ in Figure
4. The set of all ancestral graphs that encode the same set of conditional
independence statements forms a Markov equivalence class.

3.1. Markov equivalence for DAGs. References [9] and [22] gave simple 
graphical conditions for determining whether two DAGs are Markov equiv- 
alent. A triple of vertices (a,b,c) is said to be unshielded if a and c are not 
adjacent and shielded otherwise. (A triple is defined in Section 2.3.) 

Theorem 3.2. Two DAGs are Markov equivalent if and only if they 
have the same adjacencies and the same unshielded colliders. 
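
The criterion of Theorem 3.2 translates into a short comparison. The sketch
below assumes each DAG is given as a set of `(parent, child)` pairs and that
the inputs really are acyclic (this is not checked); the function names are
hypothetical.

```python
from itertools import combinations


def skeleton(dag):
    """The adjacencies of a DAG given as a set of (parent, child) pairs."""
    return {frozenset(edge) for edge in dag}


def unshielded_colliders(dag):
    """Pairs ({a, c}, b) with a -> b <- c and a, c not adjacent."""
    adjacent = skeleton(dag)
    parents = {}
    for p, c in dag:
        parents.setdefault(c, set()).add(p)
    return {
        (frozenset((a, c)), b)
        for b, ps in parents.items()
        for a, c in combinations(sorted(ps), 2)
        if frozenset((a, c)) not in adjacent
    }


def markov_equivalent_dags(d1, d2):
    return (skeleton(d1) == skeleton(d2)
            and unshielded_colliders(d1) == unshielded_colliders(d2))


chain = {("a", "b"), ("b", "c")}        # a -> b -> c
reversed_chain = {("c", "b"), ("b", "a")}  # a <- b <- c
collider = {("a", "b"), ("c", "b")}     # a -> b <- c
print(markov_equivalent_dags(chain, reversed_chain))
print(markov_equivalent_dags(chain, collider))
```

The two chains share a skeleton and have no unshielded colliders, so they are
equivalent; the third graph has the same skeleton but introduces the
unshielded collider at b.
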

That two Markov equivalent DAGs have the same adjacencies is a direct
consequence of the fact that DAGs satisfy a pairwise Markov property.

Proposition 3.3 ([12], page 50). In a DAG $\mathcal{D}$, if a and b are not
adjacent and $b \notin \mathrm{an}(a)$, then a is d-separated from b by $V \setminus (\mathrm{de}(b) \cup \{a\})$.

This is a consequence of the local Markov property for DAGs [12], applied
to b, which implies that b is d-separated from $V \setminus (\mathrm{pa}(b) \cup \mathrm{de}(b))$ by pa(b);
$b \notin \mathrm{an}(a)$ implies $a \notin \mathrm{de}(b)$ and $\mathrm{pa}(b) \subseteq V \setminus (\mathrm{de}(b) \cup \{a\})$. Note that, by
acyclicity, for any pair a, b, either $b \notin \mathrm{an}(a)$ or $a \notin \mathrm{an}(b)$. Consequently,
in a DAG, every missing edge implies a conditional independence between
the nonadjacent vertices. In general, no such pairwise property holds for
ancestral graphs. For example, there is no set that m-separates a and b in
the graph in Figure 3(a). This motivates the following section.

3.2. Maximal ancestral graphs. 

Definition 3.4. An ancestral graph $\mathcal{G}$ is said to be maximal if, for
every pair of nonadjacent vertices (a, b), there exists a set Z ($a, b \notin Z$) such
that a and b are m-separated conditional on Z.

These graphs are maximal in the sense that no additional edge may be
added to the graph without changing the associated independence model. In
a nonmaximal ancestral graph two nonadjacent vertices a and b, for which
no m-separating set Z exists, will be joined by an inducing path.

Definition 3.5. An inducing path $\pi$ between vertices a and b in an
ancestral graph $\mathcal{G}$ is a path on which every nonendpoint vertex is both a
collider on $\pi$ and an ancestor of at least one of the endpoints, a, b.

For a proof, see [15], Corollary 4.3, where the definition given here is
termed a "primitive" inducing path; the concept was introduced by Verma
and Pearl [21]. Note that, strictly speaking, an inducing "path"
$\pi = (a, v_1, \ldots, v_k, b)$ is a collection of paths: the collider path $\pi$, together with directed
paths from each vertex $v_i$, $1 \le i \le k$, to one of the endpoints. The name
"inducing" path refers to the fact that given any set Z ($a, b \notin Z$), $\pi$ is
m-connecting given Z. If there is some vertex $v_i \notin \mathrm{an}(Z)$, then there is an
m-connecting path involving one or more of the directed paths; otherwise
the path $\pi$ itself is m-connecting.

Fig. 4. $\mathcal{G}_1$, $\mathcal{G}_2$, $\mathcal{G}_3$ have the same adjacencies and the same unshielded colliders, but $\mathcal{G}_1$
and $\mathcal{G}_3$ are not Markov equivalent. $\pi = (x,q,b,y)$ forms a discriminating path for b in
every graph.

Figure 3(a) shows an example of a nonmaximal ancestral graph. The path
(a, c, d, b) forms an inducing path between a and b. By adding the bi-directed
edge $a \leftrightarrow b$, the graph is made maximal without changing the associated
independence model (which is empty), as shown in Figure 3(b). As is the
case in this example, in general, if $\pi = (a, v_1, \ldots, v_k, b)$ is an inducing path,
then only a bi-directed edge $a \leftrightarrow b$ may be added while obeying Definition
2.1. Since there are arrowheads present at a and b, adding an undirected
edge is ruled out by (c); adding a directed edge would violate (b) since we
would either have $a \leftrightarrow v_1 \to \cdots \to b \to a$ or $b \leftrightarrow v_k \to \cdots \to a \to b$.
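
Definition 3.5 can be tested mechanically on a candidate path. In the sketch
below the function names are hypothetical, and the four-vertex graph is an
assumed example of the kind the text describes (an inducing path (a, c, d, b)
between nonadjacent a and b); its exact edges are a guess made for
illustration, since the figure itself is not reproduced here.

```python
def build_marks(edges):
    """Map (u, v) to the mark at v on the edge between u and v: '>' or '-'."""
    ends = {"--": ("-", "-"), "->": ("-", ">"), "<->": (">", ">")}
    marks = {}
    for a, kind, b in edges:
        marks[(b, a)], marks[(a, b)] = ends[kind]
    return marks


def ancestors(marks, v):
    """Vertices with a directed path to v, together with v itself."""
    seen, stack = {v}, [v]
    while stack:
        child = stack.pop()
        for (p, c), at_c in marks.items():
            is_parent = c == child and at_c == ">" and marks[(c, p)] == "-"
            if is_parent and p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def is_inducing_path(edges, path):
    marks = build_marks(edges)
    if len(set(path)) != len(path):
        return False  # a path consists of distinct vertices
    if any((u, v) not in marks for u, v in zip(path, path[1:])):
        return False  # consecutive vertices must be adjacent
    near_ends = ancestors(marks, path[0]) | ancestors(marks, path[-1])
    for u, v, w in zip(path, path[1:], path[2:]):
        collider = marks[(u, v)] == ">" and marks[(w, v)] == ">"
        if not (collider and v in near_ends):
            return False
    return True


# Assumed nonmaximal ancestral graph: a <-> c <-> d <-> b, c -> b, d -> a.
g = [("a", "<->", "c"), ("c", "<->", "d"), ("d", "<->", "b"),
     ("c", "->", "b"), ("d", "->", "a")]
print(is_inducing_path(g, ["a", "c", "d", "b"]))
print(is_inducing_path(g, ["a", "c", "b"]))
```

On the assumed graph, c and d are colliders on (a, c, d, b) and each is a
parent of one endpoint, so the path is inducing; on (a, c, b) the vertex c has
a tail toward b and is therefore a noncollider.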

By [15], Theorem 5.1, for every nonmaximal ancestral graph $\mathcal{G}$ there is a
unique maximal ancestral graph $\bar{\mathcal{G}}$, with the same independence model, of
which it is a subgraph; in fact, $\bar{\mathcal{G}} = \mathcal{G}[^{\emptyset}_{\emptyset}$
and thus $\bar{\mathcal{G}}$ may be constructed in polynomial time. Consequently, the
problem of characterizing Markov equivalence for ancestral graphs naturally
reduces to that of characterizing equivalence in the case where both graphs
are maximal. Except where noted, in the remainder of this paper, we will
restrict attention to maximal ancestral graphs (MAGs).

3.3. Necessary conditions for Markov equivalence. 

Proposition 3.6. If $\mathcal{G}_1$, $\mathcal{G}_2$ are MAGs and $\mathcal{G}_1 \sim \mathcal{G}_2$, then $\mathcal{G}_1$ and $\mathcal{G}_2$
have the same adjacencies and unshielded colliders.

Proof. Since $\mathcal{G}_1$ is maximal, for each pair of nonadjacent vertices (x,y)
in $\mathcal{G}_1$, there is some set Z such that x and y are m-separated given Z in
$\mathcal{G}_1$. If x and y are adjacent in $\mathcal{G}_2$, then they are not m-separated by Z,
contradicting $\mathcal{G}_1 \sim \mathcal{G}_2$. So, adjacencies in $\mathcal{G}_2$ are a subset of those in $\mathcal{G}_1$. By
a symmetric argument, the adjacencies in $\mathcal{G}_1$ are a subset of those in $\mathcal{G}_2$.

Suppose, for a contradiction, that (a, b, c) is an unshielded collider in $\mathcal{G}_1$
but not in $\mathcal{G}_2$. Since $\mathcal{G}_1$ is maximal, for some set Z, a and c are m-separated
by Z, and $b \notin Z$. If (a, b, c) is a noncollider in $\mathcal{G}_2$ then a and c are m-connected
given Z, which is a contradiction. Hence, every unshielded collider in $\mathcal{G}_1$ is
present in $\mathcal{G}_2$. The conclusion follows by symmetry. □

An important consequence of this proposition is that if $\mathcal{G}_1$ and $\mathcal{G}_2$ are
maximal and Markov equivalent, then a sequence of vertices forming a path
in $\mathcal{G}_1$ also forms a path in $\mathcal{G}_2$ and vice versa, though the edge-types on these
paths may differ. Consequently, when $\mathcal{G}_1 \sim \mathcal{G}_2$, we will often refer to the path
$\pi^*$ in $\mathcal{G}_2$ corresponding to a given path $\pi$ in $\mathcal{G}_1$.

A key difference between DAGs and MAGs is that having the same adjacencies
and the same unshielded colliders, though necessary, is no longer
sufficient for Markov equivalence. Consider the graphs shown in Figure 4.
$\mathcal{G}_1$ and $\mathcal{G}_3$ contain the same adjacencies and the same unshielded colliders,
but these two graphs are not Markov equivalent to each other. In $\mathcal{G}_1$, x
is m-separated from y given q; but in $\mathcal{G}_3$, x is m-connected to
y given q. In fact, in any graph Markov equivalent to $\mathcal{G}_1$, (q,b,y) forms a
shielded collider. (There is only one such graph, $\mathcal{G}_2$, so $\{\mathcal{G}_1, \mathcal{G}_2\}$ forms a
Markov equivalence class.) However, in general, it is clearly not necessary
that two graphs have all of the same shielded colliders in order for them
to be Markov equivalent. Much of the remainder of this paper will focus on
identifying the "relevant" set of colliders for judging Markov equivalence.
The main result of this paper follows.

Theorem 3.7. If $\mathcal{G}_1$, $\mathcal{G}_2$ are MAGs, then $\mathcal{G}_1 \sim \mathcal{G}_2$ if and only if $\mathcal{G}_1$ and
$\mathcal{G}_2$ have the same adjacencies and the same colliders with order.

The set of "colliders with order" within a graph is defined recursively in
Definition 3.11 in the next section. The proof concludes in Section 3.10.

3.4. Discriminating paths in maximal ancestral graphs. A discriminating 
path, if present in two Markov equivalent MAGs, implies that a certain 
shielded triple will be of the same type in both graphs. 

Definition 3.8 [18]. A path $\pi = (x, q_1, \ldots, q_p, b, y)$ ($p \ge 1$) is a discriminating
path for $(q_p, b, y)$ in a MAG $\mathcal{G}$ if:

(i) x is not adjacent to y, and,

(ii) every vertex $q_i$ ($1 \le i \le p$) is a collider on $\pi$, and a parent of y.

We will often refer to a section $\pi(x,y)$ of some path $\pi$ as a discriminating
path for b, thereby implicitly specifying the triple $(q_p, b, y) = \pi(q_p, y)$. By
convention, we order the endpoints of the discriminating path so it is the
second endpoint (in this case, y) which is in the discriminated triple. We are
free to order x and y in this way, since, in our notation, $\pi(x,y)$ and $\pi(y,x)$
represent the same section of $\pi$ (see Section 2.1).
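
The two conditions of Definition 3.8 can be checked directly for a given
vertex sequence. The sketch below uses hypothetical function names, and the
two four-vertex graphs are assumed encodings consistent with the description
of the first and third graphs of Figure 4 (a MAG in which b is a collider, and
a DAG in which it is not); the figure itself is not reproduced here, so the
edge marks are an assumption.

```python
def build_marks(edges):
    """Map (u, v) to the mark at v on the edge between u and v: '>' or '-'."""
    ends = {"--": ("-", "-"), "->": ("-", ">"), "<->": (">", ">")}
    marks = {}
    for a, kind, b in edges:
        marks[(b, a)], marks[(a, b)] = ends[kind]
    return marks


def is_collider(edges, u, v, w):
    """True if arrowheads meet at v on the triple (u, v, w)."""
    marks = build_marks(edges)
    return marks[(u, v)] == ">" and marks[(w, v)] == ">"


def is_discriminating_path(edges, path):
    """Check Definition 3.8 for path = (x, q_1, ..., q_p, b, y), p >= 1."""
    marks = build_marks(edges)
    if len(path) < 4 or len(set(path)) != len(path):
        return False
    if any((u, v) not in marks for u, v in zip(path, path[1:])):
        return False  # not a path in the graph
    x, y = path[0], path[-1]
    if (x, y) in marks:
        return False  # condition (i): x and y must not be adjacent
    for i in range(1, len(path) - 2):  # positions of q_1, ..., q_p
        prev, q, nxt = path[i - 1], path[i], path[i + 1]
        collider = marks[(prev, q)] == ">" and marks[(nxt, q)] == ">"
        parent_of_y = marks.get((q, y)) == ">" and marks.get((y, q)) == "-"
        if not (collider and parent_of_y):
            return False  # condition (ii)
    return True


# Assumed encodings: x -> q <-> b <-> y with q -> y, and the DAG
# x -> q <- b -> y with q -> y.
g1 = [("x", "->", "q"), ("q", "<->", "b"), ("b", "<->", "y"), ("q", "->", "y")]
g3 = [("x", "->", "q"), ("b", "->", "q"), ("b", "->", "y"), ("q", "->", "y")]
for g in (g1, g3):
    print(is_discriminating_path(g, ["x", "q", "b", "y"]),
          is_collider(g, "q", "b", "y"))
```

In both assumed graphs (x, q, b, y) is discriminating for b, yet b is a
collider in the first and a noncollider in the second.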

The paths (x,q,b,y) in $\mathcal{G}_1$, $\mathcal{G}_2$ and $\mathcal{G}_3$ from Figure 4 are examples of
discriminating paths for b. Like an inducing path, a discriminating "path"
$\pi = (x, q_1, \ldots, q_p, b, y)$ is, in fact, a collection of paths

$$x \mathrel{?\!\!\to} q_1 \leftrightarrow \cdots \leftrightarrow q_j \to y \qquad (1 \le j \le p),$$

$$x \mathrel{?\!\!\to} q_1 \leftrightarrow \cdots \leftrightarrow q_p \mathrel{\leftarrow\!\!?} b \mathrel{?\!\!\to} y,$$

together with the (additional) requirement that the endpoints x and y are
not adjacent. Consider a discriminating path $\pi = (x, q_1, \ldots, q_p, b, y)$ in an
ancestral graph $\mathcal{G}$. If a given set Z ($x, y \notin Z$) does not contain all vertices
$q_i$, $1 \le i \le p$, then, for some j, $q_j \notin Z$ and for all $k < j$, $q_k \in Z$, so that the
path $(x, q_1, \ldots, q_j, y)$ m-connects x and y given Z (because $q_1, \ldots, q_{j-1}$ are
colliders and $q_j$ is a noncollider); see Figure 5. Hence, if Z m-separates x
and y then $\{q_1, \ldots, q_p\} \subseteq Z$. Consequently, if b is a collider on the path $\pi$ in
the graph $\mathcal{G}$ and Z m-separates x and y, then $b \notin Z$; otherwise, the path $\pi$
would m-connect x and y, since every nonendpoint vertex on $\pi$ would be a
collider and in Z. Conversely, if b is a noncollider on the path $\pi$, then b is
a member of any set Z that m-separates x and y.

Thus, whenever $(x, q_1, \ldots, q_p, b, y)$ forms a discriminating path in $\mathcal{G}$, then
b is a collider [noncollider] if and only if every set Z m-separating x and y is
such that $b \notin Z$ [$b \in Z$]. It follows that if $\mathcal{G}^* \sim \mathcal{G}$ and the path corresponding
to $\pi$, say $\pi^*$, also forms a discriminating path for b in $\mathcal{G}^*$, then b is a collider
on $\pi^*$ (in $\mathcal{G}^*$) if and only if b is a collider on $\pi$ (in $\mathcal{G}$). Thus, we have proved
the following.

Lemma 3.9. Let $\pi = (x, q_1, \ldots, q_p, b, y)$ be a discriminating path for b in
the MAG $\mathcal{G}$. If $\mathcal{G}^*$ is a MAG, $\mathcal{G}^* \sim \mathcal{G}$, and the corresponding path $\pi^*$ forms
a discriminating path for b in $\mathcal{G}^*$, then b is a collider on $\pi$ in $\mathcal{G}$ if and only
if b is a collider on $\pi^*$ in $\mathcal{G}^*$.

Thus, in general, even though $q_p$ and y are adjacent, $(q_p, b, y)$ is "discriminated"
by the path $\pi$ to be of the same type (collider or noncollider) on the
corresponding path in any graph $\mathcal{G}^*$ Markov equivalent to $\mathcal{G}$ in which the
corresponding path $\pi^*$ also forms a discriminating path. Though discriminating
paths can exist in DAGs, they are not important for determining Markov
equivalence, because such paths always discriminate noncolliders (see $\mathcal{G}_3$ in
Figure 4). If (x,q,b) forms a collider, then since there are no bi-directed
edges in a DAG, it follows that b is a parent of q.

Fig. 5. The unshielded noncolliders $(x, q_1, y)$ and the sequence of discriminating paths
for the noncolliders $(q_{j-1}, q_j, y)$ ($1 < j \le p$). See Lemma 3.10 and Corollary 3.14.

The following lemma gives a sufficient condition under which the path $\pi^*$
corresponding to a discriminating path $\pi$ in a MAG $\mathcal{G}$ will also be discriminating
in another Markov equivalent MAG $\mathcal{G}^*$.

Lemma 3.10. If $\pi = (x, q_1, \ldots, q_p, b, y)$ is a discriminating path in a
MAG $\mathcal{G}$, then, in any MAG $\mathcal{G}^*$ with $\mathcal{G}^* \sim \mathcal{G}$ in which the $q_i$ are colliders
on the corresponding path $\pi^*$, the edges between $q_i$ and y in $\mathcal{G}^*$ are of the
form $q_i \to y$ ($1 \le i \le p$).

Proof. The proof proceeds by induction on i. First, consider the $(q_1, y)$
edge in $\mathcal{G}^*$. If there is an arrowhead at $q_1$, then $(x, q_1, y)$ forms an unshielded
collider in $\mathcal{G}^*$ but an unshielded noncollider in $\mathcal{G}$. But then, by Proposition
3.6, $\mathcal{G}$ and $\mathcal{G}^*$ are not Markov equivalent, which is a contradiction. Since
$x \mathrel{?\!\!\to} q_1 \mathrel{-\!?} y$, but $q_1 - y$ is ruled out by Definition 2.1(c), we have $q_1 \to y$
in $\mathcal{G}^*$.

Suppose that $q_j \to y$ for $1 \le j < i$ in $\mathcal{G}^*$. Then, the path $(x, q_1, \ldots, q_i, y)$,
$i \le p$, forms a discriminating path for $q_i$ in both $\mathcal{G}^*$ and $\mathcal{G}$. If $q_i \mathrel{\leftarrow\!\!?} y$ in $\mathcal{G}^*$,
then $(q_{i-1}, q_i, y)$ forms a collider in $\mathcal{G}^*$ but a noncollider in $\mathcal{G}$. But then, by
Lemma 3.9, we have $\mathcal{G} \not\sim \mathcal{G}^*$, which is a contradiction. Since $q_{i-1} \leftrightarrow q_i \mathrel{-\!?} y$,
but $q_i - y$ is ruled out by Definition 2.1(c), we have $q_i \to y$ in $\mathcal{G}^*$ as
required. □

One might hope that, if $\mathcal{G}_1 \sim \mathcal{G}_2$, then $\mathcal{G}_1$ and $\mathcal{G}_2$ would have the same
discriminating paths. Unfortunately, this is not the case. It is possible for a
path $\pi$ to be discriminating in $\mathcal{G}$, and yet the corresponding path $\pi^*$ not be
discriminating in $\mathcal{G}^*$ even though $\mathcal{G} \sim \mathcal{G}^*$. Hence, the premise in Lemma 3.9
will not hold for all pairs of Markov equivalent graphs. Thus, the fact that
a noncollider is discriminated by a path in $\mathcal{G}$ does not mean that it will be
present in every graph Markov equivalent to $\mathcal{G}$.

Consider the example given by the two graphs in Figure 6(i). Note that
q is a collider on the path (x,q,b,y) in $\mathcal{G}_1$, but not in $\mathcal{G}_2$; (x,q,b,y) forms
a discriminating path in $\mathcal{G}_1$, but not in $\mathcal{G}_2$, though $\mathcal{G}_1 \sim \mathcal{G}_2$. Hence, although
(q,b,y) is a noncollider in any graph Markov equivalent to $\mathcal{G}_1$ in which
(x, q, b, y) forms a discriminating path for b, (q, b, y) need not be a noncollider
in graphs such as $\mathcal{G}_2$, where the corresponding path is not discriminating for
b.

However, we conjecture that if a collider is discriminated by some path in
$\mathcal{G}$, then this collider will be present in every graph $\mathcal{G}^*$ Markov equivalent to
$\mathcal{G}$, regardless of whether there is a discriminating path for this collider in $\mathcal{G}^*$
or not. For example, the collider (q,b,y) in the graph $\mathcal{G}_1$, shown in Figure
6(ii), is present in every graph Markov equivalent to $\mathcal{G}_1$, even though the
path (x,q,b,y) does not always form a discriminating path, as in $\mathcal{G}_2$, shown
in Figure 6(ii).

Fig. 6. Two examples of maximal ancestral graphs that are Markov equivalent where
(x,q,b,y) forms a discriminating path in $\mathcal{G}_1$, but not in $\mathcal{G}_2$.

The results in this section present a dilemma; it is clear that discriminat- 
ing paths, when present in both graphs, lead directly to necessary conditions 
for Markov equivalence. However, a discriminating path for a given triple 
may not be present in all graphs within a Markov equivalence class. We 
avoid this problem by identifying, via a recursive definition, a sub-class of 
discriminating paths and associated triples (those "with order") that are 
always present, and by showing that, in conjunction with the conditions in 
Proposition 3.6, these triples provide sufficient conditions for determining 
Markov equivalence. 

Definition 3.11. Let $\mathfrak{D}_i$ ($i \ge 0$) be the set of triples of order $i$ in a
MAG $\mathcal{G}$, defined recursively as follows:

Order 0. A triple $\langle a,b,c\rangle \in \mathfrak{D}_0$ if $a$ and $c$ are not adjacent.
Order $i+1$. A triple $\langle a,b,c\rangle \in \mathfrak{D}_{i+1}$ if

(1) for all $j < i+1$, $\langle a,b,c\rangle \notin \mathfrak{D}_j$, and,

(2) there is a discriminating path $\langle x,q_1,\dots,q_p,b,y\rangle$ for $b$ with
either $\langle a,b,c\rangle = \langle q_p,b,y\rangle$ or $\langle a,b,c\rangle = \langle y,b,q_p\rangle$ and the $p$
colliders

$$\langle x,q_1,q_2\rangle,\dots,\langle q_{p-1},q_p,b\rangle \in \bigcup_{j \le i} \mathfrak{D}_j.$$

If $\langle a,b,c\rangle \in \mathfrak{D}_i$ then the triple is said to have order $i$. If a triple has order $i$
for some $i$, then we will say that the triple has order. A discriminating path
is said to have order $i$ if, excepting $\langle q_p,b,y\rangle$, every collider on the path has
order at most $i-1$, and at least one collider has order $i-1$.

For example, in every graph in Figure 4, the triple $\langle x,q,b\rangle$ has order 0,
while $\langle q,b,y\rangle$ has order 1. It is important to note that not every triple in
a graph will have an order. For example, in all the graphs in Figure 6, the
triples $\langle x,q,b\rangle$ and $\langle q,b,y\rangle$ do not have order. However, it is possible for a
triple without order to be of the same type (collider or noncollider) in every
graph in the Markov equivalence class, such as triple $\langle q,b,y\rangle$ in Figure 6(ii).
Note that the order (if any) of a shielded triple is the minimum of the orders
of all discriminating paths (with order) for that triple.
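
The base case of Definition 3.11 is easy to make concrete. The following Python sketch enumerates the order-0 (unshielded) triples of a small mixed graph and tests whether a triple is a collider. The edge-mark table `marks`, the helper names, and the example graph ($x \to q \leftrightarrow b \leftrightarrow y$ with $q \to y$) are assumptions chosen for illustration; they are not taken from the paper or its figures.

```python
from itertools import permutations

def make_graph(edges):
    """Build an edge-mark table from (a, kind, b) triples.

    marks[(a, b)] is the mark at b's end of the edge between a and b:
    '>' for an arrowhead, '-' for a tail (hypothetical representation).
    """
    marks = {}
    for a, kind, b in edges:
        if kind == "->":
            marks[(a, b)], marks[(b, a)] = ">", "-"
        elif kind == "<->":
            marks[(a, b)], marks[(b, a)] = ">", ">"
        elif kind == "--":
            marks[(a, b)], marks[(b, a)] = "-", "-"
        else:
            raise ValueError(f"unknown edge kind: {kind}")
    return marks

def order0_triples(marks):
    """Ordered triples (a, b, c) with a-b and b-c adjacent but a, c not adjacent."""
    vertices = {v for edge in marks for v in edge}
    return {(a, b, c) for a, b, c in permutations(vertices, 3)
            if (a, b) in marks and (b, c) in marks and (a, c) not in marks}

def is_collider(marks, a, b, c):
    """True if both edges a-b and c-b carry an arrowhead at b."""
    return marks.get((a, b)) == ">" and marks.get((c, b)) == ">"

# Illustrative graph (an assumption): x -> q <-> b <-> y, q -> y.
g = make_graph([("x", "->", "q"), ("q", "<->", "b"),
                ("b", "<->", "y"), ("q", "->", "y")])
```

In this example the triple (q, b, y) is shielded, since q and y are adjacent, so it does not appear among the order-0 triples; deciding whether it has a higher order requires the discriminating-path clause of the definition.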

We now show that a necessary condition for two graphs to be Markov 
equivalent is that they have the same colliders with order. 

Proposition 3.12. If $\langle a,b,c\rangle$ has order $r$ in a MAG $\mathcal{G}$, then $\langle a,b,c\rangle$
has order $r$ in any MAG $\mathcal{G}^*$, with $\mathcal{G}^* \sim \mathcal{G}$, and, further, $\langle a,b,c\rangle$ is a collider
in $\mathcal{G}$ if and only if $\langle a,b,c\rangle$ is a collider in $\mathcal{G}^*$.

Proof. The proof is by induction on $r$, the order of $\langle a,b,c\rangle$. For $r = 0$,
the result follows from Proposition 3.6. For $r > 0$, by Definition 3.11, there
exists a discriminating path $\pi = \langle q_0,\dots,q_p = a,b,c\rangle$ or $\langle q_0,\dots,q_p = c,b,a\rangle$ in
$\mathcal{G}$ such that, with the possible exception of $\langle a,b,c\rangle$, every other triple on $\pi$
is a collider and has order less than $r$. By the induction hypothesis, in $\mathcal{G}^*$
these triples have the same order as in $\mathcal{G}$ and also form colliders. By Lemma
3.10, since the $q_i$'s ($i > 0$) are colliders on the corresponding path $\pi^*$ in $\mathcal{G}^*$,
$q_i \to y$ ($1 \le i \le p$) in $\mathcal{G}^*$. Thus, $\pi^*$ also forms a discriminating path in $\mathcal{G}^*$,
and so $\langle a,b,c\rangle$ has order at most $r$ in $\mathcal{G}^*$. However, if $\langle a,b,c\rangle$ has order less
than $r$ in $\mathcal{G}^*$, then, by the inductive hypothesis (applied to $\mathcal{G}^*$), $\langle a,b,c\rangle$ will
have lower order than $r$ in $\mathcal{G}$, contrary to assumption. Thus, $\langle a,b,c\rangle$ has
order $r$ in $\mathcal{G}^*$. The result follows by Lemma 3.9. □

Lemma 3.13. If MAGs $\mathcal{G}_1$ and $\mathcal{G}_2$ have the same adjacencies and are
such that:

(i) every collider with order in $\mathcal{G}_1$ is a collider in $\mathcal{G}_2$, and

(ii) every collider with order in $\mathcal{G}_2$ is a collider in $\mathcal{G}_1$,

then, for all $r \ge 0$, $\langle a,b,c\rangle$ is a collider [noncollider] with order $r$ in $\mathcal{G}_1$ iff
$\langle a,b,c\rangle$ is a collider [noncollider] with order $r$ in $\mathcal{G}_2$.

It will follow from Lemma 3.13 and Theorem 3.7 below that conditions 
(i) and (ii), together with the same adjacencies, are sufficient for Markov 
equivalence. 

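Proof of Lemma 3.13. We argue by induction for each order $r$.

($r = 0$). Suppose $\langle a,b,c\rangle$ is a triple of order 0 in $\mathcal{G}_1$ [$\mathcal{G}_2$]. By Definition
3.11, $a$ and $c$ are not adjacent in $\mathcal{G}_1$ [$\mathcal{G}_2$]. Hence, $a$ and $c$ are not adjacent in
$\mathcal{G}_2$ [$\mathcal{G}_1$], so $\langle a,b,c\rangle$ has order 0 in $\mathcal{G}_2$ [$\mathcal{G}_1$]. If $\langle a,b,c\rangle$ forms a collider in $\mathcal{G}_1$ [$\mathcal{G}_2$],
then, by (i) [(ii)], it forms a collider (with order 0) in $\mathcal{G}_2$ [$\mathcal{G}_1$]. Conversely, if
$\langle a,b,c\rangle$ forms a noncollider in $\mathcal{G}_1$ [$\mathcal{G}_2$], then, since it also has order 0 in $\mathcal{G}_2$
[$\mathcal{G}_1$], by (ii) [(i)], $\langle a,b,c\rangle$ cannot be a collider in $\mathcal{G}_2$ [$\mathcal{G}_1$].

($r > 0$). Suppose the result holds for all $s < r$. If $\langle a,b,c\rangle$ is a triple with
order $r$ in $\mathcal{G}_1$ [$\mathcal{G}_2$], then there is a discriminating path $\mu = \langle q_0,q_1,\dots,q_p,b,y\rangle$,
where either $q_p = a$ and $y = c$, or $q_p = c$ and $y = a$, and each collider $q_i$
($1 \le i \le p$) on $\mu$ has order less than $r$ by Definition 3.11. By the induction
hypothesis, each collider $q_i$ is also a collider on the corresponding path $\mu^*$
in $\mathcal{G}_2$ [$\mathcal{G}_1$] with the same order as in $\mathcal{G}_1$ [$\mathcal{G}_2$].

We claim that $\mu^*$ also forms a discriminating path in $\mathcal{G}_2$ [$\mathcal{G}_1$]. Since we
have $q_0 \mathrel{?\!\to} q_1 \leftrightarrow \cdots \leftrightarrow q_p \mathrel{\leftarrow\!?} b$ in $\mathcal{G}_2$ [$\mathcal{G}_1$], it suffices to show that $q_j \to y$
($1 \le j \le p$) in $\mathcal{G}_2$ [$\mathcal{G}_1$] (see Figure 5). Triple $\langle q_0,q_1,y\rangle$ is a noncollider with
order 0 in $\mathcal{G}_1$ [$\mathcal{G}_2$] because $q_0$ and $y$ are not adjacent. Hence, by the inductive
hypothesis, $\langle q_0,q_1,y\rangle$ is a noncollider (with order 0) in $\mathcal{G}_2$ [$\mathcal{G}_1$]. Further, by
Definition 2.1(c), $q_1 \to y$ in $\mathcal{G}_2$ [$\mathcal{G}_1$], because $q_0 \mathrel{?\!\to} q_1$. Arguing inductively,
assume that $q_i \to y$ ($1 \le i < j$) in $\mathcal{G}_2$ [$\mathcal{G}_1$] so that $\langle q_0,q_1,\dots,q_j,y\rangle$ forms a
discriminating path with order at most $r$ for $\langle q_{j-1},q_j,y\rangle$ in both graphs.
Consequently, if $\langle q_{j-1},q_j,y\rangle$ formed a collider in $\mathcal{G}_2$ [$\mathcal{G}_1$], then $\langle q_{j-1},q_j,y\rangle$
would be a collider with order at most $r$ in $\mathcal{G}_2$ [$\mathcal{G}_1$] but a noncollider in $\mathcal{G}_1$ [$\mathcal{G}_2$],
contrary to (ii) [(i)]. Since $q_{j-1} \mathrel{?\!\to} q_j$ and $\langle q_{j-1},q_j,y\rangle$ forms a noncollider,
by Definition 2.1(c), $q_j \to y$ in $\mathcal{G}_2$ [$\mathcal{G}_1$].

Hence, $\mu^*$ forms a discriminating path with order at most $r$ in $\mathcal{G}_2$ [$\mathcal{G}_1$],
so $\langle a,b,c\rangle$ has order at most $r$ in $\mathcal{G}_2$ [$\mathcal{G}_1$]. However, if $\langle a,b,c\rangle$ has order less
than $r$ in $\mathcal{G}_2$ [$\mathcal{G}_1$], then, by the inductive hypothesis, $\langle a,b,c\rangle$ will have lower
order than $r$ in $\mathcal{G}_1$ [$\mathcal{G}_2$], contrary to assumption. Thus, $\langle a,b,c\rangle$ has order $r$
in both graphs.

Now, if $\langle a,b,c\rangle$ is a collider in $\mathcal{G}_1$ [$\mathcal{G}_2$], then, by (i) [(ii)], $\langle a,b,c\rangle$ is also
a collider in $\mathcal{G}_2$ [$\mathcal{G}_1$]. Conversely, if $\langle a,b,c\rangle$ is a noncollider in $\mathcal{G}_1$ [$\mathcal{G}_2$], then it
cannot be a collider in $\mathcal{G}_2$ [$\mathcal{G}_1$] as that would violate (ii) [(i)]. □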

Corollary 3.14. If MAGs $\mathcal{G}_1$ and $\mathcal{G}_2$ have the same adjacencies and
$\langle a,b,c\rangle$ is a collider with order in $\mathcal{G}_1$ iff $\langle a,b,c\rangle$ is a collider with order in
$\mathcal{G}_2$, then $\langle a,b,c\rangle$ is a noncollider with order in $\mathcal{G}_1$ iff $\langle a,b,c\rangle$ is a noncollider
with order in $\mathcal{G}_2$.

Proof. This follows directly from Lemma 3.13. □

Though Proposition 3.12 appears similar to Corollary 3.14, the premise
in the former assumes the two graphs are Markov equivalent, while in the
latter it does not.

3.5. Discriminating sections of a path. It follows from Proposition 3.12
that having the same colliders with order is a necessary condition for Markov
equivalence. As a step toward showing that this condition (together with the
same adjacencies) is sufficient, we will show that every triple on a "minimal"
m-connecting path has order (see Section 3.6). We first consider, in
general, the relationships between different sections of a given path, where
the endpoints of each section are distinguished.

Let $\pi$ be a path with endpoints $l$, $r$. Let $\mathfrak{S}_\pi = \{\langle x_i,b_i\rangle \mid 1 \le i \le m\}$ be a
set of ordered pairs of vertices on $\pi$, such that: (a) $x_i = b_j$ implies $i \ne j$, and
(b) the $b_i$ are distinct and not endpoints of $\pi$. Define a relation on the $b_i$ in
$\mathfrak{S}_\pi$: $b_s \prec_\pi b_t$ if $b_s$ is a nonendpoint vertex on the section $\pi(x_t,b_t)$.
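
To make the relation $\prec_\pi$ concrete, the sketch below computes it for a path given as a list of distinct vertices and a list of ordered pairs $\langle x_i,b_i\rangle$. The function names and the six-vertex example are assumptions for illustration only; "nonendpoint vertex on the section" is read as lying strictly between $x_t$ and $b_t$ on the path.

```python
def section_interior(path, x, b):
    """Vertices lying strictly between x and b on the path."""
    i, j = path.index(x), path.index(b)
    lo, hi = min(i, j), max(i, j)
    return set(path[lo + 1:hi])

def precedes(path, pairs):
    """Return {(b_s, b_t)} such that b_s is strictly inside the section pi(x_t, b_t)."""
    bs = [b for _, b in pairs]
    relation = set()
    for x_t, b_t in pairs:
        inner = section_interior(path, x_t, b_t)
        for b_s in bs:
            if b_s != b_t and b_s in inner:
                relation.add((b_s, b_t))
    return relation

# Hypothetical path l - x1 - b1 - b2 - x2 - r with pairs <x1, b2> and <x2, b1>.
path = ["l", "x1", "b1", "b2", "x2", "r"]
pairs = [("x1", "b2"), ("x2", "b1")]
relation = precedes(path, pairs)
```

Here the two sections overlap, so the relation contains both (b1, b2) and (b2, b1): a cycle of length two in which b1 lies on the section from l to its own x-vertex and b2 lies on the section from its x-vertex to r.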

Lemma 3.15. With $\mathfrak{S}_\pi$ and $\prec_\pi$ as defined, if $b_1 \prec_\pi \cdots \prec_\pi b_m \prec_\pi b_1$
then there exist $b_s, b_t$ such that $b_s \prec_\pi b_t \prec_\pi b_s$, $b_s$ is on $\pi(l,x_s)$, and $b_t$ is on
$\pi(x_t,r)$.

Proof. It follows, from (b), that for a given $b$ there is at most one
$x$ such that $\langle x,b\rangle \in \mathfrak{S}_\pi$. Let $L = \{b \mid \langle x,b\rangle \in \mathfrak{S}_\pi,\ b \text{ is on } \pi(l,x)\}$; similarly,
let $R = \{b \mid \langle x,b\rangle \in \mathfrak{S}_\pi,\ b \text{ is on } \pi(x,r)\}$ (see Figure 7). $R \cap L = \emptyset$ by (a)
and (b). $L \ne \emptyset$ because if $b_i, b_j \in R$ and $b_i \prec_\pi b_j$ then $b_j$ is closer to $r$ than
$b_i$ on $\pi$, but if $b_1,\dots,b_p \in R$, then $b_1 \prec_\pi \cdots \prec_\pi b_p \prec_\pi b_1$ implies that $b_1$ is
closer to $r$ than $b_1$, a contradiction. Similarly, $R \ne \emptyset$. Let $b_s$ be the vertex
in $L$ that is closest to $r$. Now, define $B = \{b \mid \langle x,b\rangle \in \mathfrak{S}_\pi,\ b \prec_\pi b_s\}$; $B \ne \emptyset$
because $b_{s^*} \prec_\pi b_s$ where $s^* = s - 1 \pmod m$. By definition of $b_s$, $B \subseteq R$.
Let $X = \{x \mid \text{for some } b \in B,\ \langle x,b\rangle \in \mathfrak{S}_\pi\}$. Let $x_t$ be the vertex in $X$ that is
closest to $l$, and $b_t$ be a corresponding vertex in $B$, so that $\langle x_t,b_t\rangle \in \mathfrak{S}_\pi$. It is
sufficient to prove that $x_t$ is on $\pi(l,b_s)$, but $x_t \ne b_s$, since then $b_s \prec_\pi b_t \prec_\pi b_s$
as required (see Figure 7). Suppose, for a contradiction, that $x_t$ is on $\pi(b_s,r)$.




Fig. 7. Illustration of the proof of Lemma 3.15. Lines indicate sections $\pi(x_i,b_i)$; filled
circles are $b_i$'s, open circles are $x_i$'s. Indicated are those sections for which the $b$ endpoint
(filled circle) belongs to $L$, $R$ and $B$. See proof for further explanation.

Let $b_k$ be the vertex in $B$ that is closest to $b_s$ [$b_k \ne b_s$ by (b)]. Since, by
hypothesis, $x_t$ is on $\pi(b_s,r)$, it follows by definition of $x_t$, that $x_k$ is also
on $\pi(b_s,r)$. By hypothesis, $b_{k^*} \prec_\pi b_k$ with $k^* = k - 1 \pmod m$. However,
$b_{k^*} \notin L$, since, by definition of $b_s$, any vertex $b_i \in L$ is on $\pi(l,b_s)$. If $b_{k^*} \in R$
then $b_{k^*} \in B$ because $b_{k^*}$ and $b_k$ are both on $\pi(b_s,x_s)$. But then $b_k$ is not the
vertex in $B$ closest to $b_s$ on $\pi$, which is a contradiction. □

We now consider the special case of the development above, in which

$$\mathfrak{S}_\pi = \{\langle x_i,b_i\rangle \mid \text{for some } y_i,\ \pi(x_i,y_i) \text{ is a discriminating path for } b_i\}. \tag{3.1}$$

In this context, by definition of a discriminating path, if $b_i \prec_\pi b_j$ then $b_i$ is
a collider on the discriminating path $\pi(x_j,y_j)$ for $b_j$, with $x_j$, $b_i$, $b_j$ and $y_j$
distinct vertices; $b_j$ and $y_j$ are adjacent (by the naming convention on page
10); both $b_i$ and $b_j$ are in shielded triples on $\pi$. That $\mathfrak{S}_\pi$ still satisfies (a)
and (b) follows from the definition of a discriminating path together with
the following.

Proposition 3.16. In a MAG $\mathcal{G}$, if $\langle a,b,c\rangle$ is a section of a path $\pi$
between $x$ and $y$, and $a$ and $c$ are adjacent in $\mathcal{G}$, then there is at most one
vertex $v$ on $\pi$ such that either $\pi(v,c)$ or $\pi(v,a)$ forms a discriminating path
for $b$.

Proof. If there is some discriminating path for $\langle a,b,c\rangle$ then $a$ is either
a parent or child of $c$. In the former case, $v$ is uniquely determined as the
closest vertex to $a$ on $\pi(x,c)$ that is not a parent of $c$. The other case is
symmetric: $v$ is the vertex closest to $c$ on $\pi(a,y)$ that is not a parent of $a$.
□

From here on, $\mathfrak{S}_\pi$ and $\prec_\pi$ will refer to (3.1). We now prove that, as the
symbol $\prec_\pi$ suggests, this relation between discriminating paths is acyclic.

Corollary 3.17. On a path $\pi$ in a MAG $\mathcal{G}$, with $\mathfrak{S}_\pi$ given by (3.1),
there is no sequence of distinct vertices $\langle b_1,b_2,\dots,b_k\rangle$, $k > 1$, such that $b_i \prec_\pi
b_{i+1}$, $1 \le i < k$, and $b_k \prec_\pi b_1$.

This acyclic property is central to establishing that every triple on a 
"minimal" m-connecting path has order; see Lemma 3.21. Note, however, 
that the relation is not transitive in general. 

Proof of Corollary 3.17. By Lemma 3.15, it is sufficient to prove
that there is no pair of distinct vertices $\{b_1,b_2\}$ such that $b_1 \prec_\pi b_2$ and
$b_2 \prec_\pi b_1$. For a contradiction, suppose that there is such a pair $\{b_1,b_2\}$ (see
Figure 8).

Fig. 8. Diagram for proof of Corollary 3.17.

By maximality, $x_1 \ne y_2$ and $x_2 \ne y_1$; otherwise, $\pi(x_1,y_1)$ or $\pi(x_2,y_2)$,
respectively, would form an inducing path with nonadjacent endpoints. We
now reach a contradiction because (i) $y_2$ lies on $\pi(x_1,y_1)$ and, hence, is a
parent of $y_1$, but (ii) $y_1$ lies on $\pi(x_2,y_2)$ and, hence, is a parent of $y_2$. □

3.6. Minimal m-connecting paths. We next study the structure of "minimal"
m-connecting paths and examine which nonconsecutive vertices on
such a path may be adjacent.

Definition 3.18. In a MAG, a path $\mu$, m-connecting $x$ and $y$ given $Z$,
will be said to be minimal if no order preserving (proper) subsequence of
the vertices on $\mu$ forms an m-connecting path between $x$ and $y$ given $Z$.
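
Definition 3.18 can be checked by brute force on small examples. The sketch below enumerates every proper order-preserving subsequence that keeps both endpoints and asks a caller-supplied predicate whether it is m-connecting. The name `is_minimal` and the predicate argument `is_m_connecting` are hypothetical; the m-connection test itself is not implemented here, and the enumeration is exponential in the path length, so this is only an illustration of the definition.

```python
from itertools import combinations

def is_minimal(path, is_m_connecting):
    """True if no proper order-preserving subsequence of `path`
    (keeping both endpoints) satisfies `is_m_connecting`."""
    first, inner, last = path[0], path[1:-1], path[-1]
    for size in range(len(inner)):          # strictly fewer inner vertices
        for keep in combinations(range(len(inner)), size):
            candidate = [first] + [inner[i] for i in keep] + [last]
            if is_m_connecting(candidate):
                return False
    return True
```

In practice the predicate would test that consecutive vertices are adjacent, that every noncollider is outside $Z$, and that every collider is an ancestor of $Z$.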

It is simple to see that if there is some path m-connecting $x$ and $y$ given
$Z$, then there is a minimal path which m-connects $x$ and $y$ given $Z$. If
$\mu = \langle v_1,\dots,v_p\rangle$ is a path, then we will refer to any pair of vertices $(v_i, v_j)$
for which $|i - j| > 1$ as nonconsecutive vertices on $\mu$. As the next lemma
shows, on a minimal m-connecting path, only certain nonconsecutive vertices
may be adjacent.

Lemma 3.19. Let $\pi$ be a minimal m-connecting path between $a$ and $b$
given $Z$ in the MAG $\mathcal{G}$. If $i$ and $j$ are two nonconsecutive vertices on $\pi$ that
are adjacent in $\mathcal{G}$ ($a = i$ or $j = b$ are possible) then exactly one of $i$ and $j$ is:
(i) a collider on $\pi$, (ii) in $Z$ and (iii) a parent of the other vertex.

Note that the existence of nonconsecutive vertices on a minimal m-connecting
path implies that there are at least four vertices on the path. Lemma 3.19
is illustrated in Figure 9.

Proof of Lemma 3.19. Suppose that $j$ is on $\pi(i,b)$; the other case
is symmetric. Let $\eta$ be the path formed by concatenating $\pi(a,i)$ with the
$(i,j)$ edge and $\pi(j,b)$ (omit the relevant section if $i = a$ or $j = b$). Define the
status of a vertex to be one of either an endpoint, a collider or a noncollider.

Suppose $i$ has the same status along $\eta$ as it does along $\pi$, and similarly
so for $j$. Then, clearly, both $\pi$ and $\eta$ are m-connecting given $Z$, but $\eta$ is
shorter than $\pi$, thereby violating the minimality of $\pi$. Hence, at least one
of $i$ and $j$ has a status on $\eta$ different from that on $\pi$. Without loss of
generality, suppose it is $i$; again, the other case is symmetric. $i$ is not an
endpoint, because $\pi$ and $\eta$ have the same endpoints. It follows that either
$i$ is a collider on $\eta$ and $i \notin \operatorname{an}(Z)$, or $i$ is a noncollider on $\eta$ and $i \in Z$.

Suppose the former, so $i \notin \operatorname{an}(Z)$, $i$ is a collider along $\eta$, but $i$ is a
noncollider along $\pi$. Since $i$ is a collider on $\eta$, and $\pi(a,i) = \eta(a,i)$, there is
an arrowhead at $i$ on $\pi(a,i)$. Then by Lemma 2.4, since $i \notin \operatorname{an}(Z)$, $\pi(i,b)$
forms a directed path from $i$ to $b$. But $j$ is on $\pi(i,b)$, and $i$ is a collider on
$\eta$; hence, $j \mathrel{?\!\to} i \to \cdots \to j$ which violates Definition 2.1(a), (b).

Hence, $i \in Z$, $i$ is a noncollider along $\eta$, but $i$ is a collider along $\pi$.
Thus, $i \mathrel{-\!?} j$ in $\mathcal{G}$. Finally, the edge cannot be undirected because
$a \mathrel{?\!-\!?} \cdots \mathrel{?\!\to} i - j$ violates Definition 2.1(c); hence, $i \to j$. □

3.7. Discriminating paths on minimal m-connecting paths. The next lemma
shows that, if a triple $\langle d,b,y\rangle$ on a minimal m-connecting path $\pi$ is shielded,
then a subsequence of the path forms a discriminating path for $b$. Thus, in
the notation of Section 3.5 on a minimal m-connecting path in a MAG, the
following holds:

$$\langle d,b,y\rangle \text{ a shielded triple on } \pi \;\Longrightarrow\; \text{there exists a nonendpoint vertex } a \text{ on } \pi \text{ such that } a \prec_\pi b.$$

Lemma 3.20. Let $\pi$ be a minimal m-connecting path between $u$ and $v$
given $Z$ in the MAG $\mathcal{G}$. If $\langle x,b,y\rangle$ is a triple along $\pi$ and $x$ is adjacent to
$y$, then $\pi$ contains a unique section that forms a discriminating path for $b$.

It follows, from Lemma 3.19, that, with the possible exception of $b$, every
nonendpoint vertex on the section forming a discriminating path is in $Z$.

Proof of Lemma 3.20. Suppose, for a contradiction, that no such
unique section exists. By Lemma 3.19, at least one of $x$ and $y$ is: (i) a collider
along $\pi$, (ii) a vertex in $Z$ and (iii) a parent of the other vertex. Without
loss of generality, suppose $x$ is the vertex satisfying (i), (ii) and (iii). Since
$x$ is a collider on $\pi$, we have $x \mathrel{\leftarrow\!?} b$ in $\mathcal{G}$. Further, since $b \mathrel{?\!\to} x \to y \mathrel{?\!-\!?} b$ in
$\mathcal{G}$, by Lemma 2.2 we have $b \mathrel{?\!\to} y$, as shown in Figure 10.

Fig. 9. Example of a minimal m-connecting path (indicated by thicker edges). Here, $Z$
is the set of colliders on the path.

Fig. 10. The path $\pi$ from $u$ to $v$ contains a unique section forming a discriminating
path for $b$ in $\mathcal{G}$. See Lemma 3.20 for further explanation.

Let $q_0 = x$ and let $i$ be such that $q_i$ is the vertex nearest $b$ on $\pi(u,b)$ that
does not satisfy at least one of the conditions (i), (ii) and (iii) satisfied by
$x$. Such a vertex exists because $u$ is an endpoint and thus does not satisfy
(i). Hence, $q_i$ is a vertex on $\pi(u,q_0)$ but $q_i \ne q_0$.

We now show that $q_i$ is not adjacent to $y$. Suppose otherwise. Since $q_i \mathrel{?\!\to}
q_{i-1} \to y$, by Lemma 2.2, we have $q_i \mathrel{?\!\to} y$. By Lemma 3.19, (i), (ii) and (iii)
are satisfied so $q_i$ is a collider on $\pi$ (hence, $q_i \ne u$), $q_i \in Z$ and $q_i \to y$. But
this contradicts the definition of $q_i$.

Hence, $\pi(q_i,y)$ forms a discriminating path for $b$. Uniqueness follows from
Proposition 3.16. □

3.8. Triples on minimal m-connecting paths. We now prove that, in a
MAG $\mathcal{G}$, every triple on a minimal m-connecting path has an order, and
thus, by Proposition 3.12, is of the same type in every MAG $\mathcal{G}^*$ with $\mathcal{G}^* \sim \mathcal{G}$.

Lemma 3.21. If $\langle a,b,c\rangle$ is a triple on a minimal m-connecting path $\pi$
between $x$ and $y$ given $Z$ in the MAG $\mathcal{G}$, then $\langle a,b,c\rangle$ has order.

Proof. Suppose, for a contradiction, that $\langle a,b,c\rangle$ does not have order.
Then, $a$ and $c$ are adjacent; otherwise, $\langle a,b,c\rangle$ is unshielded, and, hence,
is of order 0. It follows from Lemma 3.20 that there is a unique section
of $\pi$ which forms a discriminating path for $\langle a,b,c\rangle$. If every triple on this
discriminating path has order, then, by definition, $\langle a,b,c\rangle$ has order. Hence,
there is at least one triple which does not have order, call this $\langle a_1,b_1,c_1\rangle$. As
before, it follows that $a_1$ and $c_1$ are adjacent, and, hence, there is a unique
section of $\pi$ which forms a discriminating path for $\langle a_1,b_1,c_1\rangle$. Arguing in
this way, we can construct an infinite sequence of shielded triples on $\pi$,
$\langle a_i,b_i,c_i\rangle$ ($i \in \mathbb{N}$), none of which have order and such that

$$\cdots \prec_\pi b_i \prec_\pi \cdots \prec_\pi b_1 \prec_\pi b.$$

However, by Corollary 3.17 all of the $b_i$'s are distinct, which is a contradiction
since $\pi$ is finite. Thus, every triple on $\pi$ has an order. □

Note that this argument shows that every triple on a minimal m-connecting
path $\pi$ has some order and also that this order is bounded by the number
of vertices on $\pi$; see page 28. Though we will at no stage need to do so, note
that to determine which order a given triple on $\pi$ has, it might be necessary
to consider other discriminating paths for the given triple, not merely those
which are sections of $\pi$.

Corollary 3.22. Suppose that $\mathcal{G}_1$ and $\mathcal{G}_2$ are MAGs with the same
adjacencies and the same colliders with order. If $\pi$ is a minimal m-connecting
path between $x$ and $y$ given $Z$ in $\mathcal{G}_1$, then $\langle a,b,c\rangle$ is a collider [noncollider] on
$\pi$ in $\mathcal{G}_1$ if and only if $\langle a,b,c\rangle$ is a collider [noncollider] on the corresponding
path $\pi^*$ in $\mathcal{G}_2$.

Proof. This follows directly from Corollary 3.14 and Lemma 3.21. □

3.9. Directed paths from colliders to vertices in Z. In this section, we
establish that if there is an m-connecting path $\pi$ between $x$ and $y$ given $Z$
in $\mathcal{G}$, then we can always find a path $\pi$ m-connecting $x$ and $y$ given $Z$ in $\mathcal{G}$
such that, if $c$ is a collider on $\pi$, then $c$ is an ancestor of a vertex in $Z$ in
any graph $\mathcal{G}^*$ which contains the same adjacencies and the same colliders
with order as $\mathcal{G}$.

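Let $|\pi|$ be the length of a path (i.e., the number of edges on it). Let
$\mathfrak{D}(b,Z)$ be the set of directed paths from $b$ to some vertex in $Z$. $\delta \in \mathfrak{D}(b,Z)$
is said to be a minimal directed path with respect to $Z$ if $|\delta| = \min_{\delta' \in \mathfrak{D}(b,Z)} |\delta'|$.
Let

$$\phi(b,Z) = \begin{cases} 0, & \text{if } b \in Z,\\ \min_{\delta \in \mathfrak{D}(b,Z)} |\delta|, & \text{if } b \in \operatorname{an}(Z) \setminus Z. \end{cases}$$

If $\pi$ m-connects given $Z$, then let

$$\phi(\pi,Z) = \sum_{b \text{ a collider on } \pi} \phi(b,Z).$$

We now construct an ordering on the set of paths m-connecting given $Z$:

$$\pi_1 \ll_Z \pi_2 \iff |\pi_1| < |\pi_2| \ \text{ or }\ \bigl[\,|\pi_1| = |\pi_2| \text{ and } \phi(\pi_1,Z) < \phi(\pi_2,Z)\,\bigr].$$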
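
The quantities above can be sketched in a few lines of Python. The function `phi_vertex` computes $\phi(b,Z)$ by breadth-first search along directed edges, and `path_key` returns the pair $(|\pi|, \phi(\pi,Z))$, so that comparing keys lexicographically mirrors the ordering $\ll_Z$. The `children` dictionary (directed edges only), the function names, and the requirement that the caller supplies the colliders of the path are all assumptions made for illustration.

```python
from collections import deque

def phi_vertex(children, b, Z):
    """Length of a shortest directed path from b into Z (0 if b is in Z).

    Returns None when b is not an ancestor of any vertex in Z.
    """
    if b in Z:
        return 0
    dist = {b: 0}
    queue = deque([b])
    while queue:
        v = queue.popleft()
        for w in children.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                if w in Z:
                    return dist[w]
                queue.append(w)
    return None

def path_key(path, colliders, children, Z):
    """(number of edges, sum of phi over the colliders on the path).

    Assumes every collider is an ancestor of Z, as holds on any
    path that m-connects given Z.
    """
    return (len(path) - 1, sum(phi_vertex(children, b, Z) for b in colliders))

# Hypothetical directed edges: b -> d -> z and c -> z, with Z = {z}.
children = {"b": ["d"], "d": ["z"], "c": ["z"]}
Z = {"z"}
key_via_b = path_key(["x", "b", "y"], ["b"], children, Z)
key_via_c = path_key(["x", "c", "y"], ["c"], children, Z)
```

Both example paths have two edges, so the comparison is decided by the second component: the path through c has the smaller key and would be preferred under the ordering.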

Definition 3.23. In an ancestral graph, an m-connecting path $\pi$ between
$x$ and $y$ given $Z$ is said to be a closest m-connecting path to $Z$ if there
is no other path $\pi^*$ m-connecting $x$ and $y$ given $Z$ such that $\pi^* \ll_Z \pi$.

Proposition 3.24. In an ancestral graph, if there is an m-connecting
path $\pi$ between $x$ and $y$ given $Z$, then there is an m-connecting path $\pi$ that
is closest to $Z$. Every such path is also a minimal m-connecting path given
$Z$.

Proof. Existence of a closest path $\pi$ is immediate since $\ll_Z$ is an
ordering on the finite and nonempty (by hypothesis) set of paths m-connecting
$x$ and $y$ given $Z$. Minimality follows, because if there were an m-connecting
path $\pi^*$ formed from an order preserving (proper) subsequence of the
vertices on $\pi$ then $|\pi^*| < |\pi|$, so $\pi^* \ll_Z \pi$, which is a contradiction. □

Fig. 11. Diagram for the proof of Lemma 3.25. Either $\langle x,\dots,a_{m-1},b^*,c_{n-1},\dots,y\rangle$ is an
m-connecting path closer to $Z$, or at least one of the noncolliders $\langle a_1,b,b^*\rangle$ and $\langle c_1,b,b^*\rangle$
has order. (Note that $a_0 = b = c_0$ and $m, n > 0$ by construction.)

Lemma 3.25. If, in a MAG $\mathcal{G}$: $\pi = \langle x,\dots,y\rangle$ is a closest m-connecting
path to $Z$; $\langle a_1,b,c_1\rangle$ is a collider on $\pi$; and $\delta = \langle b,b^*,\dots,z\rangle$ is a minimal
directed path with respect to $Z$ from $b$ to some $z \in Z$; then at least one of
the noncolliders $a_1 \mathrel{?\!\to} b \to b^*$ or $b^* \leftarrow b \mathrel{\leftarrow\!?} c_1$ has order in $\mathcal{G}$.

Proof. By Proposition 3.24, $\pi$ is a minimal m-connecting path between
$x$ and $y$ given $Z$. Now, suppose for a contradiction that neither triple
$a_1 \mathrel{?\!\to} b \to b^*$ nor $b^* \leftarrow b \mathrel{\leftarrow\!?} c_1$ has order. Then, $a_1$ is adjacent to $b^*$ and by
Lemma 2.2 we have $a_1 \mathrel{?\!\to} b^*$. Similarly, $c_1 \mathrel{?\!\to} b^*$.

Define $a_0 = b$, and let $a_m$ be the vertex along $\pi(x,b)$ that is furthest from
$b$ such that for all $k$, $0 \le k < m$: (i) $a_k$ is a collider on $\pi$, and (ii) $a_k \to b^*$ (see
Figure 11). Such a vertex $a_m$ exists because $a_0 = b$ satisfies the conditions
for $a_k$; note that $m > 0$. Then, the following hold:

(1) $a_m$ is adjacent to $b^*$. Otherwise for $m = 1$, $\langle a_1,b,b^*\rangle$ is unshielded; or
for $m > 1$, $\langle a_m,\dots,a_1,b,b^*\rangle$ forms a discriminating path with order for
$\langle a_1,b,b^*\rangle$ (by Lemma 3.21). In either case, $\langle a_1,b,b^*\rangle$ would have order
which is a contradiction.

(2) Since $a_m \mathrel{?\!\to} a_{m-1} \to b^*$, by Lemma 2.2, we have that $a_m \mathrel{?\!\to} b^*$ is in $\mathcal{G}$.

(3) If $a_m \ne x$, then triples $\langle a_{m+1},a_m,a_{m-1}\rangle$ and $\langle a_{m+1},a_m,b^*\rangle$ are of the
same type (collider/noncollider) where $a_{m+1}$ is the predecessor of $a_m$
along $\pi(x,a_m)$: if $a_m \leftrightarrow b^*$ then, since $a_m \mathrel{?\!\to} a_{m-1} \to b^*$, by Lemma
2.2 we have that $a_m \leftrightarrow a_{m-1}$. If $a_m \to b^*$, then by the definition of $a_m$,
triple $\langle a_{m+1},a_m,a_{m-1}\rangle$ is not a collider.

Define $c_0 = b$. Let $c_n$ be the vertex along $\pi(b,y)$ that is furthest from $b$
such that, for all $j$, $0 \le j < n$: (i) $c_j$ is a collider on $\pi$, and (ii) $c_j \to b^*$. By
symmetric arguments to (1), (2) and (3), we may show that $c_n \mathrel{?\!\to} b^*$, and
either $c_n = y$ or the triples $\langle c_{n-1},c_n,c_{n+1}\rangle$ and $\langle b^*,c_n,c_{n+1}\rangle$ are of the same
type, where $c_{n+1}$ is the successor of $c_n$ on the path $\pi(b,y)$ (see Figure 11).

Let $\eta$ be the path formed by concatenating the section $\pi(x,a_m)$ to $a_m \mathrel{?\!\to} b^*
\mathrel{\leftarrow\!?} c_n$ and $\pi(c_n,y)$ (if $x = a_m$, or $c_n = y$ then omit the relevant sections).
$\eta$ forms an m-connecting path given $Z$ because $a_m$ and $c_n$ have the same
status on $\eta$ as they have on $\pi$, and $b^*$ is an ancestor of $Z$. However, since
$|\eta| \le |\pi|$ and $\phi(\eta,Z) < \phi(\pi,Z)$, $\eta \ll_Z \pi$, which is a contradiction. □

Lemma 3.26. In a MAG $\mathcal{G}$ if $\delta$ is a directed path from $v$ to $z \in Z$ and
$\delta$ is minimal with respect to $Z$, then every noncollider on $\delta$ is unshielded
(= order 0).

Proof. Suppose that $\langle a,b,c\rangle$ is a noncollider on $\delta$ and $a \to b \to c$. If
$a$ and $c$ are adjacent then, by Definition 2.1(a), (b), we have $a \to c$, which
contradicts the minimality of $\delta$, with respect to $Z$. □

Though not needed, in fact no nonconsecutive vertices on $\delta$ are adjacent.

Corollary 3.27. Let $\mathcal{G}_1$, $\mathcal{G}_2$ be MAGs with the same adjacencies, and
the same colliders with order. If in $\mathcal{G}_1$: $\pi$ m-connects $x$ and $y$ given $Z$; $\pi$
is a closest path to $Z$; $\langle a,b,c\rangle$ is a collider on $\pi$; and $\delta$ forms a directed
path from $b$ to a vertex $z \in Z$ that is minimal with respect to $Z$; then the
corresponding path $\delta^*$ is a directed path in $\mathcal{G}_2$.

Proof. By Proposition 3.24, $\pi$ is a minimal m-connecting path. Let
$\delta = \langle b = d_0,\dots,d_n = z\rangle$. The proof is by induction on the edges $\langle d_i,d_{i+1}\rangle$ of
$\delta^*$.

Base case ($i = 0$). Since $\langle a,b,c\rangle$ is a collider on $\pi$ in $\mathcal{G}_1$, and $\pi$ is minimal,
by Corollary 3.22, $\langle a,b,c\rangle$ is also a collider in $\mathcal{G}_2$. By Lemma 3.25 at least
one of the noncolliders, $\langle a,b,d_1\rangle$, $\langle c,b,d_1\rangle$ has order in $\mathcal{G}_1$, and by Corollary
3.14 is also a noncollider in $\mathcal{G}_2$. It follows by Definition 2.1(c) that $b \to d_1$
in $\mathcal{G}_2$ as required (so in fact $\langle a,b,d_1\rangle$ and $\langle d_1,b,c\rangle$ are both noncolliders in
$\mathcal{G}_2$).

Inductive case ($1 \le i < n$). Assume that the section $\delta^*(b,d_i)$ forms a
directed path from $b$ to $d_i$ in $\mathcal{G}_2$. By Lemma 3.26 the noncollider $\langle d_{i-1},d_i,d_{i+1}\rangle$
has order, and hence is a noncollider in $\mathcal{G}_2$. By the induction hypothesis we
have $d_{i-1} \to d_i$ in $\mathcal{G}_2$; hence, by Definition 2.1(c), $d_i \to d_{i+1}$ in $\mathcal{G}_2$, as
required. □

3.10. Characterization of Markov equivalence. We now prove the main
result of this paper, Theorem 3.7.

Proof of Theorem 3.7. (if) Since $\mathcal{G}_1$ and $\mathcal{G}_2$ have the same adjacencies
and colliders with order, by Corollary 3.14, $\mathcal{G}_1$ and $\mathcal{G}_2$ also have the same
noncolliders with order. By definition, $X$ is m-separated from $Y$ given $Z$ if
and only if for all $x \in X$, $y \in Y$, $x$ is m-separated from $y$ given $Z$. Thus, it is
sufficient to show that $x$ and $y$ are m-connected given $Z$ in $\mathcal{G}_1$ if and only if
$x$ and $y$ are m-connected given $Z$ in $\mathcal{G}_2$. If $x$ and $y$ are m-connected given $Z$
in $\mathcal{G}_1$, then, by Proposition 3.24, there exists a path $\pi$ which m-connects $x$
and $y$ given $Z$, is minimal and is closest to $Z$ in $\mathcal{G}_1$. By Corollary 3.22 every
triple on $\pi$ is of the same type on the corresponding path $\pi^*$ in $\mathcal{G}_2$. Hence,
every noncollider on $\pi^*$ is not in $Z$. Since $\pi$ is m-connecting, every collider $b$
on $\pi$ is an ancestor of $Z$; hence, if $b \notin Z$ then there exists a directed path $\delta_b$
from $b$ to some vertex $z_b \in Z$ that is minimal with respect to $Z$. By Corollary
3.27, the corresponding path $\delta_b^*$ forms a directed path from $b$ to $z_b$ in
$\mathcal{G}_2$. Thus, every collider on $\pi^*$ is an ancestor of $Z$ in $\mathcal{G}_2$ and $\pi^*$ m-connects
$x$ and $y$ given $Z$ in $\mathcal{G}_2$. Likewise, it is easy to see (by symmetry) that an
m-connecting path in $\mathcal{G}_2$ implies that there is an m-connecting path in $\mathcal{G}_1$.
Thus, $\mathcal{G}_1$ and $\mathcal{G}_2$ are Markov equivalent.

(only if) Conversely, if $\mathcal{G}_1$ and $\mathcal{G}_2$ are Markov equivalent, then, by
Proposition 3.6, they have the same adjacencies, and, by Proposition 3.12, they
have the same colliders with order. □

Corollary 3.28. Two ancestral graphs $\mathcal{G}_1$ and $\mathcal{G}_2$ are Markov equivalent
iff the corresponding unique MAGs $\bar{\mathcal{G}}_1$ and $\bar{\mathcal{G}}_2$ of which $\mathcal{G}_1$ and $\mathcal{G}_2$ are,
respectively, subgraphs and to which they are Markov equivalent, satisfy the
conditions given in Theorem 3.7.

4. Related work and computational complexity. Two prior characterizations
of Markov equivalence for MAGs have been given in the literature.

Theorem 4.1 [19]. Two MAGs $\mathcal{G}_1$ and $\mathcal{G}_2$ are Markov equivalent if and
only if:

(i) $\mathcal{G}_1$ and $\mathcal{G}_2$ have the same adjacencies;

(ii) $\mathcal{G}_1$ and $\mathcal{G}_2$ have the same unshielded colliders; and

(iii) if $\pi$ forms a discriminating path for $b$ in $\mathcal{G}_1$ and $\mathcal{G}_2$, then $b$ is a
collider on $\pi$ in $\mathcal{G}_1$ if and only if it is a collider on $\pi$ in $\mathcal{G}_2$.

More recently, [24] gave the following elegant characterization.

Theorem 4.2 [24]. Two MAGs $\mathcal{G}_1$ and $\mathcal{G}_2$ are Markov equivalent if and
only if $\mathcal{G}_1$ and $\mathcal{G}_2$ have the same minimal collider paths.

Here, a collider path $\nu = \langle v_1,\dots,v_n\rangle$ is minimal if there is no order
preserving subsequence $\langle v_1 = v_{i_1},\dots,v_{i_k} = v_n\rangle$, which forms a collider path
(single edges are trivially minimal collider paths).

However, neither of these characterizations leads to a polynomial-time
algorithm. Clause (iii) in Theorem 4.1 requires us to verify that, if there is
a discriminating path in both $\mathcal{G}_1$ and $\mathcal{G}_2$, then the triple discriminated is
a collider or noncollider in both. Thus, in principle, we need to find every
discriminating path for a given triple; otherwise, it is possible that, although
a triple is discriminated by some path in $\mathcal{G}_1$ and some path in $\mathcal{G}_2$, in fact
there is no discriminating path that is common to both graphs. Since the
number of such paths may grow at a super-polynomial rate, finding them all
would not be feasible in polynomial time. (Reference [19] outlined a method
for checking Markov equivalence using the conditions of Theorem 3.7, rather
than Theorem 4.1, though the paper only proves the latter result. The
computational complexity claim in that paper was also incorrect.)

Similarly, it is not hard to show that the number of minimal collider paths
in a graph may grow super-polynomially with the number of vertices so the
conditions in Theorem 4.2 cannot, in general, be verified in polynomial time
(see supplementary material [1]).

In the Appendix, we provide an algorithm that verifies the conditions in
Theorem 3.7 in $O(ne^4)$ calculations, where the graphs have $n$ vertices, and $e$
edges. For a general, not necessarily maximal, ancestral graph $\mathcal{G}$ the unique
MAG $\bar{\mathcal{G}}$ of which it is a subgraph and to which it is Markov equivalent may
be found in $O(n^5)$ time; thus, the conditions in Corollary 3.28 may also be
checked in polynomial time.

4.1. Summary graphs and MC graphs. Summary graphs, described in
Cox and Wermuth [5], represent another approach to representing the
independence structure of DAGs under marginalizing and conditioning. For
a given summary graph $\mathcal{H}$, it is always possible to construct a DAG $\mathcal{D}(\mathcal{H})$
with additional variables such that the DAG is Markov equivalent to $\mathcal{H}$ after
marginalizing and conditioning. Consequently, it is always possible to
transform a summary graph into an ancestral graph via the graphical
transformation mentioned in Section 2.6. Hence, via this transformation, the results
in this paper also provide an algorithm for determining the Markov
equivalence of two summary graphs. We note that in general it may not be possible
to recover the summary graph from the corresponding ancestral graph (see
[15], Section 9).

Koster introduced another class of graphs, called MC-graphs, together
with an operation of marginalizing and conditioning (see [10, 11]). For
MC-graphs it is not always the case that there exists some DAG which is Markov
equivalent to the MC-graph under marginalizing and conditioning. However,
for the subclass of MC-graphs which are Markov equivalent to DAGs with
additional variables under marginalizing and conditioning, we may again
apply the results of this paper to establish Markov equivalence.

Table A.1
The algorithm Reachable($D$, $w$)

Inputs:  a directed graph $D(V, E)$; an element $w \in V$
Output:  a set $\mathbb{S}$ of elements connected to $w$ in $D$

1  $\mathbb{S}_0 = \emptyset$; $\mathbb{S}_1 = \{w\}$; $p = 1$;
2  repeat
3      $\mathbb{S}_{p+1} = \mathbb{S}_p \cup \{w_2 \mid w_1 \in \mathbb{S}_p \setminus \mathbb{S}_{p-1} \text{ and } (w_1,w_2) \in E\}$;
4      $p = p + 1$;
5  until $\mathbb{S}_p = \mathbb{S}_{p-1}$;
6  return $\mathbb{S} = \mathbb{S}_p$.
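
The layered search of Table A.1 translates almost line for line into Python. In this sketch the directed graph is given only as a set of ordered pairs `edges`; the function name `reachable` and this input format are assumptions for illustration.

```python
def reachable(edges, w):
    """Set of vertices reachable from w along directed edges (w included).

    `prev` and `cur` play the roles of S_{p-1} and S_p in Table A.1;
    only the newest layer cur - prev is expanded at each step.
    """
    prev, cur = set(), {w}
    while cur != prev:
        frontier = cur - prev
        nxt = cur | {w2 for (w1, w2) in edges if w1 in frontier}
        prev, cur = cur, nxt
    return cur
```

For example, with edges `{(1, 2), (2, 3), (4, 5)}` the call `reachable(edges, 1)` returns `{1, 2, 3}`. Each pass scans the whole edge set, which keeps the sketch short; an adjacency-list representation would avoid the repeated scans.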



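APPENDIX
We introduce the following notation:

$$\mathfrak{Adj}(\mathcal{G}) = \{(x,y) \mid x \text{ and } y \text{ are adjacent in } \mathcal{G}\},$$
$$\mathfrak{Col}(\mathcal{G}) = \{\langle x,y,z\rangle \mid x \mathrel{?\!\to} y \mathrel{\leftarrow\!?} z \text{ in } \mathcal{G}\},$$
$$\mathfrak{OCol}(\mathcal{G}) = \{\langle x,y,z\rangle \mid \langle x,y,z\rangle \in \mathfrak{Col}(\mathcal{G}) \text{ and } \langle x,y,z\rangle \text{ has order}\},$$
$$\mathfrak{ICol}(\mathcal{G}) = \bigcap_{\mathcal{G}^* \sim \mathcal{G}} \mathfrak{Col}(\mathcal{G}^*),$$

which are, respectively, the set of adjacencies, colliders, colliders with order
in $\mathcal{G}$ and colliders common to all graphs in the Markov equivalence class
containing $\mathcal{G}$. In general, we have $\mathfrak{OCol}(\mathcal{G}) \subseteq \mathfrak{ICol}(\mathcal{G}) \subseteq \mathfrak{Col}(\mathcal{G})$.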



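Table A.2 
The algorithm Triples($\mathcal{G}$) 

Input:   a maximal ancestral graph $\mathcal{G}$ 
Output:  a set of triples $\mathbb{T}$ such that $\mathfrak{OCol}(\mathcal{G}) \subseteq \mathbb{T} \subseteq \mathfrak{ICol}(\mathcal{G})$ 

1   $\mathbb{T}_0 = \{(a, b, c) \mid (a, b, c) \in \mathfrak{Col}(\mathcal{G}),\ (a, c) \notin \mathfrak{Adj}(\mathcal{G})\}$; 
2   $k = 0$; 
3   repeat 
4       $k = k + 1$; $\mathbb{T}_k = \mathbb{T}_{k-1}$; 
5       for each $(a, b, c) \in \mathfrak{Col}(\mathcal{G}) \setminus \mathbb{T}_{k-1}$ with $a \in \mathrm{sp}_{\mathcal{G}}(b) \cap \mathrm{pa}_{\mathcal{G}}(c)$: 
6           $V = \{(t, u) \mid t, u \in \mathrm{pa}(c),\ t \leftrightarrow u \text{ in } \mathcal{G}\} \cup \{(b, a)\}$; 
7           $E = \{((t, u), (u, v)) \mid (t, u, v) \in \mathbb{T}_{k-1},\ (t, u), (u, v) \in V\}$; 
8           $\mathbb{S} = \text{Reachable}((V, E), (b, a))$; 
9           $X = \{x \mid \exists\, y, z,\ (z, y, x) \in \mathbb{T}_{k-1},\ (z, y) \in \mathbb{S}\}$; 
10          if $X \setminus \{v \mid (v, c) \in \mathfrak{Adj}(\mathcal{G})\} \neq \emptyset$ 
            then $\mathbb{T}_k = \mathbb{T}_k \cup \{(a, b, c), (c, b, a)\}$; 
11  until $\mathbb{T}_k = \mathbb{T}_{k-1}$; 
12  return $\mathbb{T} = \mathbb{T}_k$. 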
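Table A.3 
The algorithm Equivalent($\mathcal{G}_1$, $\mathcal{G}_2$) 

Inputs:  two maximal ancestral graphs $\mathcal{G}_1$ and $\mathcal{G}_2$ 
Output:  a Boolean variable indicating whether $\mathfrak{I}_m(\mathcal{G}_1) = \mathfrak{I}_m(\mathcal{G}_2)$ 

1   if $\mathfrak{Adj}(\mathcal{G}_1) \neq \mathfrak{Adj}(\mathcal{G}_2)$ return FALSE; 
2   if $\text{Triples}(\mathcal{G}_1) \setminus \mathfrak{Col}(\mathcal{G}_2) \neq \emptyset$ return FALSE; 
3   if $\text{Triples}(\mathcal{G}_2) \setminus \mathfrak{Col}(\mathcal{G}_1) \neq \emptyset$ return FALSE; 
4   return TRUE. 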





The equivalence algorithm is described in Tables A.1-A.3. The main 
procedure, Triples($\mathcal{G}$), identifies a superset of the colliders with order as 
follows. A discriminating path $\pi = (x, z, \ldots, a, b, c)$ for the collider $(a, b, c)$ 
(where $z$ may equal $a$) that is in $\mathcal{G}$ divides naturally into three parts. First, 
there is a collider $(a, b, c)$, which is not in $\mathbb{T}_{k-1}$. Second, there is a collider 
path $\gamma = (z = v_1 \leftrightarrow \cdots \leftrightarrow v_j = b)$, where $v_1, \ldots, v_j \in \mathrm{pa}(c)$, and the triples 
$(v_{i-1}, v_i, v_{i+1}) \in \mathbb{T}_{k-1}$. The third part is an edge $x\,{?\!\!\rightarrow}\,z$, where $x$ is not 
adjacent to $c$ and for some $y$, $x\,{?\!\!\rightarrow}\,z \leftrightarrow y \in \mathbb{T}_{k-1}$, and $z \leftrightarrow y$ is on the path 
$\gamma$. Line 5 of Triples($\mathcal{G}$) locates candidate triples $(a, b, c)$. Lines 6, 7 and 8 
search for collider paths $\gamma$. Note that "vertices" ($V$) and "edges" ($E$) in $D$ 
correspond to, respectively, edges and colliders in $\mathcal{G}$. Finally, lines 9 and 10 
search for a vertex satisfying the conditions on $x$. For further insight into the 
operation of the algorithm, we refer the reader to the proof of correctness. 
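The steps described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the encoding `marks[(u, v)]` (the endpoint mark at `v` on the edge between `u` and `v`, `">"` for an arrowhead and `"-"` for a tail) and all function names are assumptions made here, and the input is assumed, without being checked, to be a maximal ancestral graph. Comments give the corresponding line of Table A.2.

```python
def reachable(edges, w):
    """Table A.1: elements reachable from w along the directed pairs in `edges`."""
    succ = {}
    for w1, w2 in edges:
        succ.setdefault(w1, []).append(w2)
    seen, frontier = {w}, [w]
    while frontier:
        nxt = []
        for w1 in frontier:
            for w2 in succ.get(w1, ()):
                if w2 not in seen:
                    seen.add(w2)
                    nxt.append(w2)
        frontier = nxt
    return seen


def triples(marks):
    """Table A.2 for a graph encoded as marks[(u, v)] = mark at v."""
    adj = set(marks)
    into = {}                                   # into[y]: neighbours with an arrowhead at y
    for (u, v), m in marks.items():
        if m == ">":
            into.setdefault(v, set()).add(u)
    col = {(x, y, z) for y, s in into.items() for x in s for z in s if x != z}

    def parents(c):                             # u -> c: arrowhead at c, tail at u
        return {u for u in into.get(c, ()) if marks[(c, u)] == "-"}

    def spouses(b):                             # u <-> b: arrowheads at both ends
        return {u for u in into.get(b, ()) if marks[(b, u)] == ">"}

    T = {(a, b, c) for (a, b, c) in col if (a, c) not in adj}      # line 1
    while True:                                                    # lines 3-11
        prev = set(T)                                              # T_{k-1}
        for a, b, c in col - prev:                                 # line 5
            pa_c = parents(c)
            if a not in spouses(b) or a not in pa_c:
                continue
            V = {(t, u) for t in pa_c for u in pa_c & spouses(t)} | {(b, a)}  # line 6
            E = {((t, u), (u, v)) for (t, u, v) in prev
                 if (t, u) in V and (u, v) in V}                   # line 7
            S = reachable(E, (b, a))                               # line 8
            X = {x for (z, y, x) in prev if (z, y) in S}           # line 9
            if any((x, c) not in adj for x in X):                  # line 10
                T |= {(a, b, c), (c, b, a)}
        if T == prev:                                              # line 11
            return T                                               # line 12
```

For instance, on the graph with edges x → a, a ↔ b, a → c and b ↔ c, in which (x, a, b, c) is a discriminating path for b, the sketch returns the unshielded collider (x, a, b) and the shielded collider (a, b, c), each in both orientations; (a, c, b) is a collider of the graph but is not returned.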

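Proposition A.1. The algorithm Triples($\mathcal{G}$) returns a set $\mathbb{T}$ satisfying 
(a) $\mathfrak{OCol}(\mathcal{G}) \subseteq \mathbb{T}$ and (b) $\mathbb{T} \subseteq \mathfrak{ICol}(\mathcal{G})$. 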

Proof of (a). The proof is by induction on the order of the collider. By 
construction, $\mathbb{T}_0$ is the set of unshielded colliders in $\mathcal{G}$, which is the set of 
colliders of order 0. Our induction hypothesis is that all colliders with order less 
than $k > 0$ are contained in $\mathbb{T}_{k-1}$, at line 11. If $(a, b, c)$ is a collider with order 
$k$, then either $a \rightarrow c$ or $c \rightarrow a$. Suppose the former. Then, there exists a 
discriminating path $(x = q_0, q_1, \ldots, q_p = a, q_{p+1} = b, c)$ on which $(q_{j-1}, q_j, q_{j+1})$ 
($1 \le j \le p$) are colliders of order less than $k$. By definition of a discriminating 
path, $(q_{p-1}, a, b)$ is a collider, as is $(a, b, c)$, so $a \in \mathrm{sp}(b)$. Thus, $(a, b, c)$ satisfies 
the conditions at line 5. In addition, for $1 \le j \le p - 1$, $q_j, q_{j+1} \in \mathrm{pa}(c)$ and 
$q_j \leftrightarrow q_{j+1}$, so $(q_{j+1}, q_j) \in V$. In addition, $(q_{p+1}, q_p) = (b, a) \in V$ by 
construction. Since for $1 \le j \le p$, $(q_{j-1}, q_j, q_{j+1})$ is a collider of order less than $k$, 
it follows by the induction hypothesis that $(q_{j-1}, q_j, q_{j+1}) \in \mathbb{T}_{k-1}$. Thus, 
$((q_{j+1}, q_j), (q_j, q_{j-1})) \in E$. Consequently, $(q_2, q_1) \in \mathbb{S}$ at line 8, since the 
sequence $((b = q_{p+1}, a = q_p), (q_p, q_{p-1}), \ldots, (q_2, q_1))$ is found (recursively) by 
calls to Reachable. Since $(q_2, q_1, x = q_0) \in \mathbb{T}_{k-1}$, it follows that $x \in X$. 
Finally, by definition of a discriminating path, $x$ is not adjacent to $c$. Thus, 
the condition in the if clause at line 10 holds, so if $(a, b, c) \notin \mathbb{T}_{k-1}$ then it 
is added to $\mathbb{T}_k$. 

Proof of (b). The proof is by induction on $k$ in the algorithm. We 
show that $\mathbb{T}_k \subseteq \mathfrak{ICol}(\mathcal{G})$. When $k = 0$, $\mathbb{T}_0$ is the set of unshielded colliders, 
so the result follows from Proposition 3.6. For $k > 0$ our induction hypothesis 
is that $\mathbb{T}_{k-1} \subseteq \mathfrak{ICol}(\mathcal{G})$. If $(a, b, c) \in \mathbb{T}_k \setminus \mathbb{T}_{k-1}$, then either $(a, b, c)$ or $(c, b, a)$ 
(but not both) satisfies the condition at line 5. Suppose the former; the other 
case is symmetric. There exists a triple $(x, y, z) \in \mathbb{T}_{k-1}$, with $(y, z) \in \mathbb{S}$, and 
$x$ not adjacent to $c$. Since $(y, z) \in \mathbb{S}$, and $x \in X$, there exists a sequence of 
edges, $s = ((b, a), \ldots, (z, y), (y, x))$ such that each consecutive pair of edges 
in $s$ forms a collider in $\mathbb{T}_{k-1}$, all vertices other than $b$ and $x$ are parents of 
$c$, and all edges other than possibly $(y, x)$ are bi-directed in $\mathcal{G}$. Note that 
it follows from the inductive hypothesis that all of the colliders formed by 
successive pairs of edges in $s$ are present in any graph $\mathcal{G}^*$ Markov equivalent 
to $\mathcal{G}$. We have thus established that, with the possible exception of the first 
and last edge in the sequence, all these edges are bi-directed in every graph 
in the Markov equivalence class. However, the sequence of edges in $s$ may 
not form a path because the associated sequence of vertices may contain 
repeats. Removing loops leads to a unique path $\pi$ with endpoints $b$ and $x$. 
By construction, $b$ and $x$ only occur in the edges $(b, a)$ and $(y, x)$, respectively 
(since $b$, $x$ are not parents of $c$, while all other vertices in the sequence are); 
consequently, these edges are on $\pi$. Hence, $\pi$ forms a collider path from $x$ 
to $b$, and all of the colliders on this path are present in every graph in the 
Markov equivalence class. By Lemma 3.10, $\pi$ forms a discriminating path in 
every graph Markov equivalent to $\mathcal{G}$. Thus, by Lemma 3.9, $(a, b, c) \in \mathfrak{ICol}(\mathcal{G})$ 
as required. □ 

Our proof establishes that all triples in Triples($\mathcal{G}$) are colliders present 
in every graph in the Markov equivalence class containing $\mathcal{G}$, which might 
include some colliders that do not have order. If we were able to identify 
any triples in Triples($\mathcal{G}$) $\setminus\ \mathfrak{OCol}(\mathcal{G})$ without increasing the complexity of 
the algorithm, then the algorithm could be made more efficient since it is 
redundant to check for the presence of such colliders in the other graph. 
However, we know of no examples where Triples($\mathcal{G}$) $\setminus\ \mathfrak{OCol}(\mathcal{G}) \neq \emptyset$. 

Proposition A.2. The algorithm Equivalent($\mathcal{G}_1$, $\mathcal{G}_2$) returns TRUE iff 
$\mathcal{G}_1$ and $\mathcal{G}_2$ are Markov equivalent. 

Proof. "If" follows from Propositions 3.6 and A.1(b). "Only if" follows 
from Proposition A.1(a), Lemma 3.13 and Theorem 3.7. □ 
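The three checks of Table A.3 amount to set comparisons once the adjacencies, colliders and Triples output of each graph are available. The sketch below assumes these have already been computed and are passed in as Python sets of tuples; the function name and argument layout are illustrative assumptions, not part of the paper.

```python
def equivalent(adj1, col1, triples1, adj2, col2, triples2):
    """Table A.3, given precomputed Adj, Col and Triples sets for both graphs."""
    if adj1 != adj2:              # line 1: adjacencies must coincide
        return False
    if triples1 - col2:           # line 2: a triple of G1 is not a collider in G2
        return False
    if triples2 - col1:           # line 3: a triple of G2 is not a collider in G1
        return False
    return True                   # line 4


# Hand-computed inputs for three DAGs on the same skeleton a - b - c.
adj = {("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")}
collider = {("a", "b", "c"), ("c", "b", "a")}   # a -> b <- c (unshielded collider)
chain = set()                                    # a -> b -> c and a <- b <- c have no colliders
```

With these inputs, the collider graph is reported as not equivalent to either chain, while the two chains are reported as equivalent to each other.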
Reachable($D$, $w$) runs in time $O(e)$ where $e$ is the number of edges in 
$D$. The graph $D$ may be represented as a list of adjacencies for each vertex, 
with each edge $(w_1, w_2)$ being considered at most once at line 3. 
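The adjacency-list representation mentioned above can be sketched as follows. This is an illustrative Python version under assumptions made here: the function name, the edge-list input, and the use of a frontier list in place of the difference $\mathbb{S}_p \setminus \mathbb{S}_{p-1}$ are not taken from the paper. Each vertex enters the frontier at most once, so each edge is examined at most once.

```python
def reachable(edges, w):
    """Return the set of elements reachable from w along directed edges.

    `edges` is an iterable of ordered pairs (w1, w2), as in Table A.1.
    """
    succ = {}                          # adjacency lists: successors of each vertex
    for w1, w2 in edges:
        succ.setdefault(w1, []).append(w2)

    seen = {w}                         # plays the role of S_p
    frontier = [w]                     # plays the role of S_p \ S_{p-1}
    while frontier:                    # an empty frontier means S_p = S_{p-1}
        nxt = []
        for w1 in frontier:
            for w2 in succ.get(w1, ()):
                if w2 not in seen:
                    seen.add(w2)
                    nxt.append(w2)
        frontier = nxt
    return seen
```

Note that the returned set always contains `w` itself, as in Table A.1, where $\mathbb{S}_1 = \{w\}$.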

Now, consider the complexity of Triples($\mathcal{G}$). Let $n$ and $e$ denote, 
respectively, the number of vertices and edges in $\mathcal{G}$. Any triple appearing on a 
minimal m-connecting path $\pi$ has order at most $n - 3$: $\pi$ contains at most 
$n$ vertices; hence, at most $n - 2$ triples; all of the other discriminating paths 
involved are sections of $\pi$; and unshielded triples (of which there is at least 
one) are of order 0. Thus, it is always sufficient for Markov equivalence to 
check that two graphs have triples of order less than $n$. Hence, the outer 
loop, at line 4, in Triples($\mathcal{G}$) is of complexity $O(n)$. The number of colliders 
in $\mathcal{G}$ is of $O(e^2)$; hence, the loop at line 5 is executed $O(e^2)$ times (for each 
$k$). Since $E$ is of size $O(e^2)$, lines 6 to 8 are also of complexity $O(e^2)$. Finally, 
line 9 is $O(e^2)$ [since $\mathbb{T}_{k-1}$ is of size $O(e^2)$] and line 10 is $O(e)$. Thus, the 
overall complexity is $O(ne^4)$. 
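
As a check on the arithmetic, the per-line bounds stated above combine as follows. This only restates those bounds; $C_{\mathrm{body}}$ is a label introduced here for the cost of one pass through lines 6 to 10.

```latex
\begin{align*}
C_{\mathrm{body}}
  &= \underbrace{O(e^2)}_{\text{lines 6--8}}
   + \underbrace{O(e^2)}_{\text{line 9}}
   + \underbrace{O(e)}_{\text{line 10}}
   = O(e^2), \\
\text{total}
  &= \underbrace{O(n)}_{\text{outer loop, line 4}}
     \times \underbrace{O(e^2)}_{\text{loop at line 5}}
     \times C_{\mathrm{body}}
   = O(n e^4).
\end{align*}
```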